Chimeras of Hepatitis C Virus and Bovine Viral Diarrhea Virus

Reference to Government Grant 

This invention was made with government support under a grant from the National 
Institutes of Health, grant numbers PHS CAS 7973 and AI40034. The government has certain 
rights in this invention. 

Related Applications 

This application claims priority to, and incorporates herein in its entirety, U.S. 
60/082,964 filed April 24, 1998. 

Background of the Invention

(1) Field of the Invention 

This invention relates generally to the development of therapies for treating hepatitis
C virus (HCV) and bovine viral diarrhea virus (BVDV) and more particularly to the
identification of such therapies using chimeric viruses comprising a genomic sequence
derived from HCV and BVDV.

(2) Description of the Related Art 

The Flaviviridae is an important family of human and animal RNA viral pathogens
(Rice, C.M. 1996. Flaviviridae: The viruses and their replication. In: Fields BN, Knipe DM,
Howley PM, eds. Fields virology. Philadelphia: Lippincott-Raven Publishers, pp. 931-960).
The three currently recognized genera of the Flaviviridae family exhibit distinct differences in
transmission, host range, and pathogenesis. For example, members of the classical flavivirus
genus, such as yellow fever virus and dengue virus, are typically transmitted to vertebrate
hosts via arthropod vectors and cause acute self-limiting disease (Monath TP, Heinz FX.
1996. Flaviviruses. In: Fields BN, Knipe DM, Howley PM, eds. Fields virology. New York:
Raven Press, pp. 961-1034). The pestiviruses, such as bovine viral diarrhea virus (BVDV)
and classical swine fever virus (CSFV), cause economically important livestock disease and
are spread by direct contact or the fecal-oral route (Thiel et al., 1996. Pestiviruses. In: Fields
BN, Knipe DM, Howley PM, eds. Fields virology. New York: Raven Press, pp. 1059-1073).
The most recently characterized Flaviviridae genus is the hepacivirus genus, the sole member
of which is the common and exclusively human pathogen, hepatitis C virus (HCV). HCV is
transmitted by contaminated blood or blood products and is the most common agent of
non-A, non-B hepatitis, affecting more than 1% of the population worldwide (Houghton, 1996.
Hepatitis C viruses. In: Fields BN, Knipe DM, Howley PM, eds. Fields virology.
Philadelphia: Lippincott-Raven Publishers, pp. 1035-1058). Unlike flavivirus and pestivirus
infections, which are usually eliminated by host immune response, chronic HCV infections
are common and can cause mild to severe liver disease including cancer.

Despite these differences, members of the Flaviviridae family share common
structural features and gene expression strategies. Virus particles consist of a lipid bilayer
envelope with embedded transmembrane glycoproteins surrounding a protein-RNA
nucleocapsid. Genome RNAs are single-stranded, of positive polarity, and function as the sole
mRNA species for translation of a single long open reading frame (ORF). This ORF is
translated into a polyprotein which is processed by cellular and viral proteases into mature
viral proteins. Structural proteins destined for incorporation into virus particles are encoded
in the N-terminal portion of the polyprotein, while the nonstructural proteins which form
components of the viral RNA replicase are encoded in the remainder.

Replication of the Flaviviridae RNA genome occurs via synthesis of a full-length
negative-strand intermediate and is asymmetric, favoring synthesis of positive-strand RNAs.
However, little is known about the details of this process. For all three genera of the
Flaviviridae family, full-length functional cDNA clones have been constructed and RNAs
transcribed from these cDNA templates are infectious. For flaviviruses and pestiviruses,
mutagenesis of these clones and efficient RNA transfection of permissive cell cultures
provide a means of probing the role of cis RNA elements and viral proteins in replicase
assembly and function. Such analyses are not yet possible for HCV since this virus is unable
to replicate efficiently in cell culture.

As in many other RNA viruses, the 5' and 3' terminal sequences of the
Flaviviridae are believed to contain conserved cis-elements important for translation, RNA
replication, and packaging (Bukh et al., Proc. Natl. Acad. Sci. USA 89:4942-4946, 1992; Deng
et al., Nucleic Acids Res. 21:1949-1957, 1993; Cahour et al., Virol. 207:68-76, 1995;
Kolykhalov et al., J. Virol. 70:3363-3371, 1996; Men et al., J. Virol. 70:3930-3937, 1996;
Tanaka et al., J. Virol. 70:3307-3312, 1996; Huang HV. 1997. Evolution of the alphavirus
promoter and the cis-acting sequences of RNA viruses. In: Saluzzo J-F, Dodet B, eds. Factors
in the emergence of arbovirus diseases. Paris: Elsevier Press, pp. 65-79; Mandl et al., J. Virol.
72:2132-2140, 1998). The 5' nontranslated region (NTR) functions initially at the level of
translation. Similar to most cellular mRNAs, flavivirus genome RNAs are translated in a
cap-dependent manner. These RNAs contain a 5' cap structure that is presumably added by
virus-encoded RNA triphosphatases, guanylyl-, and methyl-transferases (Rice, 1996, supra).
In contrast, the translational strategy employed by pestiviruses and HCV is more similar to
that of the picornaviruses. These RNAs appear to be uncapped and contain long 5' NTRs with
cis RNA elements that function as internal ribosome entry sites (IRES) for translation
initiation at the polyprotein AUG (Lemon et al., Semin. Virol. 8:274-288, 1997).

The 5' NTRs of HCV and BVDV have a similar structural and functional organization
despite containing only short stretches of high sequence identity (Wang et al., Curr. Top.
Microbiol. Immunol. 205:99-115, 1995; Lemon et al., 1997, supra). The IRES within each
NTR is located at the 3' end of the NTR at a position proximal to the AUG initiation codon of
the ORF. Although the 5' terminal sequence of each of these viruses is apparently not
required for IRES function (Rijnbrand et al., FEBS Lett. 365:115-119, 1995; Honda et al.,
Virology 222:31-42, 1996; Rijnbrand et al., J. Virol. 71:451-457, 1997), these sequences are
highly conserved among different strains of HCV (Bukh et al., Proc. Natl. Acad. Sci.
USA 89:4942-4946, 1992) or BVDV (Deng et al., 1993, supra), suggesting they play other
roles in viral replication. For example, sequences in the 5' NTR may be required for
regulating translation versus initiation of negative-strand RNA synthesis. Such regulation
could occur by direct interaction of 5' and 3' RNA elements or indirectly, via RNA-protein
interactions. Sequences in the 5' NTR may also modulate packaging versus translation.
Finally, sequences complementary to the 5' NTR, which are located at the 3' end of
negative-strand RNA, are likely to function in the initiation of positive-strand RNA synthesis.

The HCV 3' NTR contains an internal polypyrimidine tract followed by a highly
conserved sequence of 98 bases at the 3' terminus, which has been shown to be required for
replication of HCV (U.S. Application Serial No. 08/811,566).

Further elucidation of the role of sequences in the HCV 5' and 3' NTRs has been
hampered by the inefficient replication of HCV in cell culture. This aspect of HCV biology
also makes it difficult to identify and test possible antiviral compounds for activity against
HCV. Thus, a need exists for a system which facilitates investigation of HCV replication and
therapeutic approaches to control HCV infections.

Summary of the Invention

Briefly, therefore, the present invention provides novel compositions and methods for
studying HCV replication which are based on the discovery that chimeras of HCV and BVDV
genomic sequences can be constructed that are able to replicate in cell culture. The
BVDV-specific sequence provides the chimeric viral nucleic acid with the ability to replicate
in cell culture, while the HCV-specific sequence allows the chimeric viral nucleic acid to be
used to screen possible compounds for anti-viral activity against HCV. It is believed that
similar replication-competent chimeras can be constructed from HCV and other pestiviruses.

Thus, in one embodiment, the present invention provides a novel, chimeric viral RNA
in which at least one of the 5' NTR, ORF and 3' NTR regions is chimeric and comprises a
nucleotide sequence from the corresponding region of a pestivirus in operable linkage with a
nucleotide sequence from the corresponding region of a hepatitis C virus (HCV). The
chimeric viral RNA is replication-competent. In preferred embodiments, the pestivirus is
BVDV.

In other embodiments, the invention provides a polynucleotide comprising a
DNA-dependent promoter operably linked to a cDNA of a chimeric viral RNA as described
above, and cells transiently transfected or stably transformed with the polynucleotide. In some
embodiments the cDNA may encode a dominant selectable marker or an assayable reporter.

In yet another embodiment, the invention provides a method for identifying
compounds having anti-HCV activity. The method comprises providing a first cell containing
a chimeric viral nucleic acid derived from HCV and a pestivirus as described above and a
second cell containing the pestivirus, and then comparing the replication efficiency of the
chimeric viral nucleic acid in the presence and absence of a test compound to the replication
efficiency of the pestivirus in the presence and absence of the test compound,
wherein a greater compound-induced reduction in replication efficiency of the chimeric viral
nucleic acid than of the pestivirus indicates the compound has anti-HCV activity.
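
The comparison recited in this method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the use of a percent reduction as the measure of "reduction in replication efficiency" are hypothetical and do not come from the disclosure, which does not prescribe a particular readout or statistic.

```python
def percent_reduction(untreated, treated):
    """Percent drop in a replication readout caused by the test compound.

    The readout (for example a titer or an RNA signal) is an assumption;
    any positive measure of replication efficiency could be used.
    """
    if untreated <= 0:
        raise ValueError("untreated replication readout must be positive")
    return 100.0 * (untreated - treated) / untreated


def indicates_anti_hcv_activity(chimera_untreated, chimera_treated,
                                pestivirus_untreated, pestivirus_treated):
    """Return True when the compound reduces replication of the chimera
    more than it reduces replication of the parental pestivirus."""
    chimera_drop = percent_reduction(chimera_untreated, chimera_treated)
    pestivirus_drop = percent_reduction(pestivirus_untreated, pestivirus_treated)
    return chimera_drop > pestivirus_drop
```

A compound that lowers both readouts equally would return False here, which reflects the purpose of the second cell: it controls for effects that are not specific to the HCV-derived sequence.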

The invention also provides a genetically-engineered virus which comprises a
chimeric viral nucleic acid derived from HCV and a pestivirus as described above. In one
embodiment the genetically-engineered virus comprises virus particles containing at least one
HCV structural protein and is useful in a vaccine against HCV. In another embodiment, the
genetically-engineered virus is attenuated as compared to the pestivirus and is useful as a
vaccine against the pestivirus.

In a still further embodiment, the invention provides a replication-competent BVDV
vector expressing a heterologous sequence. The BVDV vector comprises the BVDV
sequences encoding the BVDV replication machinery. In some embodiments, the
replication-competent BVDV vector expresses an antigen and is useful as a vaccine.

Brief Description of the Drawings 

Figure 1 is a schematic representation of the 5' NTRs of BVDV, HCV, and EMCV
showing the position of the start codons of the ORF, with boxes indicating the canonical
IRES elements.

Figure 2 shows a schematic representation of BVDV and HCV chimeras, plaque
phenotypes, reticulocyte translation efficiencies relative to parental BVDV, specific
infectivities in MDBK cells, titers at 24 and 48 h post-transfection (or 72 h, as indicated), and
an indication of whether pseudorevertants arose, with results from BVDV, 5'HCV,
BVDV+HCV, and BVDV+HCVdelB3 chimeras shown in Fig. 2A and results from
BVDV+HCVdelB2B3, BVDV+HCVdelB1B2B3, BVDV+HCVdelB2B3H1, and
BVDV+HCVdelB2B3H1H2 shown in Fig. 2B, where N.D. means not determined.

Figure 3 illustrates the in vitro translation efficiency of BVDV RNA or chimeras
showing bar graphs of the amount of Npro, the N-terminal protein in the BVDV ORF,
expressed by the various constructs.

Figure 4 illustrates a schematic representation of EMCV chimeras, plaque 
phenotypes, reticulocyte translation efficiencies relative to parental BVDV, specific 
infectivities in MDBK cells, titers at 24 and 48 h post-transfection (or 72 h, as indicated), and 
an indication of whether pseudorevertants arose.

Figure 5 illustrates a pseudorevertant analysis showing in (Fig. 5A) the relative
positions of mutations detected within the plaque-purified variants of passaged
BVDV+HCVdelB1B2B3, 5'EMCV, and 5'HCV, and in (Fig. 5B) the 5' terminal sequences of
pseudorevertants of BVDV+HCVdelB1B2B3, 5'EMCV, and 5'HCV. Novel nucleotides or
sequences are shown in bold upper case type. Pseudorevertants are numbered and designated
by the suffix ".R". The upper case sequence in BVDV+HCVdelB1B2B3 and
BVDV+HCVdelB1B2B3.R1 is a remnant of downstream BVDV 5' NTR sequences and was
created during the cloning procedures.

Figure 6 illustrates the construction of derivatives of 5'HCV designed to contain 5'
termini corresponding to the sequence detected within the three analyzed pseudorevertants.
Fig. 6A shows the 5' terminal sequence of the 5'HCV derivatives, with the suffix (orig)
designating a derivative containing the original 5' terminal sequence of the pseudorevertant;
the suffix (cons) designating a derivative containing the consensus tetranucleotide sequence
5'-GUAU at the same position; and novel sequences shown in bold upper case type. Fig. 6B
shows plaque phenotypes, reticulocyte translation efficiencies relative to parental BVDV,
specific infectivities in MDBK cells, and titers at 24 and 48 h post-transfection.

Figure 7 illustrates a single step growth curve for various chimeric constructs
showing released virus titers measured by performing plaque assays on MDBK cells
transfected with various constructs.

Figure 8 illustrates replication of BVDV RNA or chimeric derivatives in transfected
MDBK cells. Equal numbers of MDBK cells (~ 8 x 10^) were electroporated with 5 µg of
each in vitro synthesized RNA. MDBK cells were also transfected with infectious yellow
fever 17D and Sindbis RNAs to provide molecular mass markers. One fifth of the transfected
cells were seeded on 35-mm dishes and incubated in D-MEM supplemented with 10% horse
serum for 6 h at 37°C. The media were then replaced with 1 ml of fresh media containing 2
µg/ml of actinomycin D and 40 µCi/ml of 3H-uridine. Incubations were continued for 10 h at
37°C. RNAs were isolated as described in Materials and Methods, and 1/4 of each sample
was denatured in glyoxal and loaded on an agarose gel. (A) Autoradiograph of the dried gel.
Only the portion of the gel containing the genomic RNAs is shown. (B) Amount of
radioactivity contained within the displayed fragments as determined by scintillation
counting. BVDV, lane 1; 5'HCV, lane 2; BVDV+HCVdelB2B3, lane 3;
BVDV+HCVdelB2B3H1, lane 4; 5'HCV.R1orig, lane 5; 5'HCV.R1cons, lane 6;
5'HCV.R3orig, lane 7; 5'HCV.R3cons, lane 8; 5'HCV.R2orig, lane 9; 5'HCV.R2cons, lane 10;
yellow fever 17D, lane 11; Sindbis, lane 12; non-transfected MDBK cells, lane 13. The
experiment shown is one of two repetitions which yielded similar results.

Figure 9 illustrates the genetic map of plasmid pACNR/BUD.

Figure 10 illustrates the sequence of low copy number plasmid pACNR/BVDV
NADL (circular) harboring the functional cDNA of cytopathic BVDV NADL (positive sense
cDNA 5' to 3'; nt 1-12578).

Figure 11 illustrates the sequence of infectious BVDV NADL (positive sense cDNA
5' to 3').

Figure 12 illustrates the sequence of infectious non-cytopathic BVDV NADL lacking
cIns (positive sense cDNA 5' to 3').

Figure 13 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R1.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 14 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R1.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 15 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R2.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 16 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R2.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 17 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R3.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 18 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R3.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 19 illustrates the sequence of prototype HCV-BVDV chimera from
pNADL/5'HR3.orig/3'H3'B with the adapted HCV 5' NTR from 5'HCV/R3.orig and tandem 3'
NTR elements from HCV followed by BVDV (positive sense cDNA 5' to 3') as discussed in
Example 5.

Figure 20 illustrates various deletions of the poly U tract in the 3' NTR HCV
sequence of BVDV/HCV chimera p5H-3H33.

Figure 21 illustrates the schematic representation of functional HCV-BVDV chimera
from pCBV/p7.

Figure 22 illustrates the sequence of functional HCV-BVDV chimera from pCBV/p7
(positive sense cDNA 5' to 3').

Figure 23 illustrates the schematic representation of a HCV/BVDV chimera with
selectable marker.

Figure 24 illustrates the sequence of functional HCV-BVDV chimera from
pCBV/p7/IRES-pac expressing a dominant selectable marker conferring resistance to
puromycin (positive sense cDNA 5' to 3').

Figure 25 illustrates the schematic representation of a bicistronic HCV/BVDV
chimera.

Figure 26 illustrates the sequence of functional bicistronic chimera expressing the
entire HCV structural region derived from plasmid pNADL/BI#41/HCV str (positive sense
cDNA 5' to 3').

Description of the Preferred Embodiments 

In accordance with the present invention, the inventors herein have succeeded in
generating HCV-BVDV chimeric RNAs which are replication competent. Such chimeras are
useful in screening compounds in vitro for antiviral activity against HCV. In addition, it is
believed that in vivo replication of HCV-BVDV chimeras according to the invention may be
attenuated as compared to wild-type BVDV and thus may be useful in vaccinating animals
against BVDV. It is also believed that the HCV chimeric structures described herein for
BVDV are applicable to other pestiviruses.

In the context of this disclosure, the following terms will be defined as follows unless
otherwise indicated:

"Cis-acting sequences" means the nucleotide sequences from an RNA virus genome
that are necessary for recognition of the genomic RNA by specific protein(s) of the RNA
virus or host cell that carry out replication, transcription, translation or packaging of the
genome.

"Genetically-engineered virus" means any virus whose genome is different from that
of a wild-type virus due to a human-made deletion, insertion, or substitution of one or more
nucleotides in the wild-type viral genome.

"Infectious" when used to describe a virus means the virus is capable of entering cells
and initiating a virus replication cycle, whether or not this leads to the production of new
RNA virus particles.

"Nucleotide sequence" as used herein refers to DNA and the corresponding RNA
sequence where relevant. It will be understood that sequences shown in the Figures are DNA
versions of the RNA sequence and that chimeric molecules of the invention may comprise
RNA molecules or cDNA copies of such RNA molecules.

"Replication-competent" as applied to a chimeric HCV-pestivirus RNA means the
RNA is capable of RNA-dependent replication in at least one cell type that supports
replication of the wild-type parental pestivirus. The number of replicated RNA molecules
produced by an HCV-pestivirus chimeric RNA of the invention is at least 10-fold higher than
the limit of detection, which is typically 10 to 100 molecules. More preferably, chimeric
RNA production by the HCV-pestivirus chimeric RNA is at least 10^ to 10^-fold higher than
the detection limit. The replication-competent chimeric RNA replicates at an efficiency that
is preferably at least 0.001%, more preferably at least 0.01%, more preferably at least 0.1%,
more preferably at least 1%, more preferably at least 10%, and most preferably at least 50%
up to 90% that of the parental pestivirus in the same cell type.
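
The graded thresholds in this definition can be expressed as a small lookup. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that replication of the chimera and of the parental pestivirus have been measured with the same readout in the same cell type. It encodes only the figures stated above (a 10-fold margin over the detection limit, and the 0.001% through 50% efficiency levels).

```python
# Efficiency levels from the definition, as percent of parental pestivirus
# replication, ordered from most to least stringent.
EFFICIENCY_TIERS_PERCENT = (50.0, 10.0, 1.0, 0.1, 0.01, 0.001)


def exceeds_detection_floor(replicated_molecules, detection_limit):
    """True when the replicated RNA count is at least 10-fold above the
    assay's limit of detection."""
    return replicated_molecules >= 10 * detection_limit


def highest_tier_met(chimera_yield, parental_yield):
    """Return the most stringent efficiency level (in percent) that the
    chimera meets, or None if it is below the lowest stated level."""
    if parental_yield <= 0:
        raise ValueError("parental yield must be positive")
    relative_percent = 100.0 * chimera_yield / parental_yield
    for tier in EFFICIENCY_TIERS_PERCENT:
        if relative_percent >= tier:
            return tier
    return None
```

For example, a chimera replicating at 5% of the parental level meets the "at least 1%" level but not the "at least 10%" level.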

"Transfected cell" means a cell containing an exogenously introduced nucleic acid 
molecule, and includes cells that are transiently transfected with the exogenous nucleic acid.

"Transformed cell" or "stably transformed cell" means a cell containing an
exogenously introduced nucleic acid molecule which is present in the cytoplasm or nucleus of
the cell and may be stably integrated into the chromosomal DNA of the cell.

"Virus" means a virion, virus particle or a viral genome.

A chimeric viral RNA according to the invention is designed to comprise a 5' NTR,
an ORF, and a 3' NTR, at least one of which is a chimeric region containing two operably
linked nucleotide sequences that are from the same region of a pestivirus and an HCV.
Pestivirus-specific sequences useful in the invention can be taken from the appropriate
genomic region of any cytopathic or noncytopathic type I or type II BVDV isolate, classical
swine fever virus (CSFV) isolate, or border disease viral isolate. For a list of pestiviruses,
see Thiel, H.-J., P. G. W. Plagemann, and V. Moennig. 1996. Pestiviruses, p. 1059-1073. In
B. N. Fields, D. M. Knipe and P. M. Howley (ed.), Fields Virology. Raven Press, New York.
HCV-specific sequences can be taken from any strain or isolate of HCV, including but not
limited to HCV-1, HCV-1a, HCV-1b, HCV-1c, HCV-2a, HCV-2b, HCV-2c, and HCV-3a.
Preferably, the parental pestivirus is a cytopathic strain of BVDV and the parental HCV strain
is HCV-1.

The pestivirus- and HCV-specific sequences are operably linked in the chimeric
region, meaning the sequences are arranged such that the resulting chimeric structure is
functional in the context of replication of the pestivirus. For example, in one preferred
embodiment the chimeric viral RNA comprises a chimeric 5' NTR which comprises a
BVDV-specific 5' terminal sequence of 5'-(G/A)UAU and an IRES derived from HCV, with
the ORF and the 3' NTR consisting of a sequence from the same regions of BVDV. The
BVDV-specific sequences at the 5' terminus and in the ORF and 3' NTR are chosen such that
they are functional in the context of BVDV, meaning the chimeric viral RNA expresses the
replication machinery of BVDV and this replication machinery is capable of replicating the
chimeric RNA. In addition, translation of the BVDV ORF in the chimeric viral RNA is
dependent upon a functional HCV IRES. The presence of a functional HCV IRES in this
chimera allows the chimera to be used to screen for compounds that target the HCV IRES and
thereby inhibit translation of the BVDV ORF as well as replication of the chimeric virus.
Such compounds would be expected to also inhibit translation of the ORF in a wild-type HCV
and consequently inhibit HCV replication.
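
As an illustration of the 5' terminal motif described above, the following Python sketch checks whether a sequence begins with 5'-(G/A)UAU. The function name and the example sequences are hypothetical and are not taken from the Figures; the only element drawn from the text is the motif itself. Because the Figures give DNA versions of the RNA sequences, the sketch accepts T in place of U.

```python
import re

# 5'-(G/A)UAU: a G or an A followed by UAU at the very start of the RNA.
BVDV_5_TERMINUS = re.compile(r"^[GA]UAU")


def has_bvdv_like_5_terminus(sequence):
    """Return True when the sequence starts with GUAU or AUAU.

    Input may be RNA or its cDNA version; T is converted to U first.
    """
    rna = sequence.strip().upper().replace("T", "U")
    return BVDV_5_TERMINUS.match(rna) is not None
```

A check of this kind says nothing about whether a construct is functional; it only confirms that the terminal tetranucleotide matches the stated pattern.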

Compounds that could be screened for anti-HCV activity using this and other
HCV-BVDV 5' NTR chimeras include but are not limited to antisense RNAs, RNA decoys that
bind proteins involved in recognition of the HCV-specific sequences, ribozymes, and small
molecule inhibitors of critical RNA-protein interactions. The use of such substances for
therapeutic applications is known in the art. See, e.g., Amarzguioui M, et al., "Hammerhead
ribozyme design and application," Cell Mol Life Sci. 1998 Nov;54(11):1175-202; Welch PJ,
et al., "Expression of ribozymes in gene transfer systems to modulate target RNA levels,"
Curr Opin Biotechnol. 1998 Oct;9(5):486-96; Bramlage B, et al., "Designing ribozymes for
the inhibition of gene expression," Trends Biotechnol. 1998 Oct;16(10):434-8; Gewirtz AM,
et al., "Nucleic acid therapeutics: state of the art and future prospects," Blood. 1998 Aug
1;92(3):712-36; Altman S., "RNase P in research and therapy," Biotechnology (N Y). 1995
Apr;13(4):327-9; Flanagan WM., "Antisense comes of age," Cancer Metastasis Rev. 1998
Jun;17(2):169-76; Agrawal S, et al., "Antisense therapeutics," Curr Opin Chem Biol. 1998
Aug;2(4):519-28; Caselmann WH, et al., "Synthetic antisense oligodeoxynucleotides as
potential drugs against hepatitis C," Intervirology. 1997;40(5-6):394-9; Neckers LM.,
"Oligodeoxynucleotide inhibitors of function: mRNA and protein interactions," Cancer J Sci
Am. 1998 May;4 Suppl 1:S35-42; Agrawal S, et al., "Mixed backbone oligonucleotides:
improvement in oligonucleotide-induced toxicity in vivo," Antisense Nucleic Acid Drug Dev.
1998 Apr;8(2):135-9; Crooke ST., "An overview of progress in antisense therapeutics,"
Antisense Nucleic Acid Drug Dev. 1998 Apr;8(2):115-22; Fraisier C, et al., "High level
inhibition of HIV replication with combination RNA decoys expressed from an HIV-Tat
inducible vector," Gene Ther. 1998 Dec;5(12):1665-76; Gervaix A, et al., "Gene therapy
targeting peripheral blood CD34+ hematopoietic stem cells of HIV-infected individuals,"
Hum Gene Ther. 1997 Dec 10;8(18):2229-38; Nakaya T, et al., "Inhibition of HIV-1
replication by targeting the Rev protein," Leukemia. 1997 Apr;11 Suppl 3:134-7; Nakaya T, et
al., "Decoy approach using RNA-DNA chimera oligonucleotides to inhibit the regulatory
function of human immunodeficiency virus type 1 Rev protein," Antimicrob Agents
Chemother. 1997 Feb;41(2):319-25; Smith C, et al., "Transient protection of human T-cells
from human immunodeficiency virus type 1 infection by transduction with adeno-associated
viral vectors which express RNA decoys," Antiviral Res. 1996 Oct;32(2):99-115; Bahner I, et
al., "Transduction of human CD34+ hematopoietic progenitor cells by a retroviral vector
expressing an RRE decoy inhibits human immunodeficiency virus type 1 replication in
myelomonocytic cells produced in long-term culture," J Virol. 1996 Jul;70(7):4352-60; Lee
SW, et al., "Inhibition of human immunodeficiency virus type 1 in human T cells by a potent
Rev response element decoy consisting of the 13-nucleotide minimal Rev-binding domain," J
Virol. 1994 Dec;68(12):8254-64; Lisziewicz J, et al., "Inhibition of human immunodeficiency
virus type 1 replication by regulated expression of a polymeric Tat activation response RNA
decoy as a strategy for gene therapy in AIDS," Proc Natl Acad Sci USA. 1993 Sep
1;90(17):8000-4; Bevec D, et al., "Inhibition of human immunodeficiency virus type 1
replication in human T cells by retroviral-mediated gene transfer of a dominant-negative Rev
trans-activator," Proc Natl Acad Sci USA. 1992 Oct 15;89(20):9870-4.

It is contemplated that a number of replication-competent chimeric structures can be
made that allow the function of various HCV sequence elements and proteins to be studied
and targeted in drug screening assays. For example, the invention includes
replication-competent HCV-pestivirus chimeras having a chimeric ORF. One such chimeric
ORF is one comprising an HCV sequence encoding the structural proteins and a pestivirus
sequence encoding the nonstructural proteins. It is believed that upon introduction into a cell,
such an HCV-BVDV ORF chimera will produce HCV-like virus particles that will be released
from the cell and be capable of infecting cells normally infected by wild-type HCV, i.e., cells
expressing an HCV receptor such as human CD81. Such ORF chimeras would be useful to
screen compounds for drugs that inhibit formation, release or entry of HCV particles. In
addition, ORF chimeras that produce virus particles containing at least one HCV structural
protein would be useful as vaccines against HCV. Other ORF chimeras contemplated by the
invention include, for example, chimeras comprising a pestivirus sequence encoding
structural proteins and an HCV sequence encoding one or more nonstructural proteins such as
the NS3 protease, NS4A cofactor, NS5A phosphoprotein/interferon resistance determinant
and/or the NS5B polymerase. Replication of such ORF chimeras would be dependent upon
the function of the HCV nonstructural protein(s) and these ORF chimeras could be used to
screen for drugs that target the HCV nonstructural protein(s) as well as to screen for and map
potential drug resistance mutations in HCV nonstructural proteins. In addition,
HCV-pestivirus ORF chimeras could be useful for developing alternative in vivo animal
models for HCV replication and HCV-associated hepatocellular carcinoma to evaluate
antivirals and anti-tumor agents.

The invention also provides replication-competent HCV-pestivirus chimeras having a
chimeric 3' NTR which contains one or more conserved elements of the HCV 3' NTR. Such
3' NTR chimeras would be useful for screening or evaluating compounds targeted against the
HCV 3' NTR. Compounds that could be screened include antisense RNA molecules,
ribozymes and small molecule inhibitors of critical RNA-protein interactions. One 3' NTR
chimera according to the invention comprises a BVDV 5' NTR, BVDV ORF and a chimeric
3' NTR which consists of an HCV-specific sequence derived from the HCV 3' NTR
immediately followed by a BVDV 3' NTR. The HCV-specific 3' NTR that allows for
replication in the context of BVDV has a deletion in the 3' NTR poly (U) tract but has all the
other HCV 3' NTR elements, including the 98-base 3' terminal conserved element.

HCV-pestivirus chimeras included within the scope of the invention include those
comprising combinations of chimeric regions, i.e., 5' NTR and ORF chimeras; 5' NTR and 3'
NTR chimeras; ORF and 3' NTR chimeras; and chimeric RNAs in which each of the 5' NTR,
ORF and 3' NTR regions comprises an HCV sequence operably linked to a pestivirus
sequence.

The invention also provides chimeric RNAs having two ORFs, or bicistronic
HCV-pestivirus chimeras. Bicistronic chimeras contemplated by the invention include
structures in which the first ORF contains one or more HCV genes and is followed by a second
IRES operably linked to a second ORF encoding the pestivirus replicase machinery. It is also
contemplated that the first ORF may encode a heterologous sequence such as an antigen.

It is believed that many HCV-pestivirus chimeras of the invention will be attenuated
as compared to the parental wild-type pestivirus. Such attenuated chimeric RNA genomes
would be candidate vaccines in the form of live-attenuated virus particles or as RNA or
cDNA "genetic" vaccines.

The invention also includes vaccines against HCV which comprise an
immunogenically-effective amount of HCV-pestivirus particles or nucleic acid. Anti-HCV
vaccines comprising virus particles should preferably contain one or more HCV structural
proteins.

The therapeutic or pharmaceutical compositions of the present invention can be
administered by any suitable route known in the art including, for example, by injection such
as intraperitoneal, intravenous, subcutaneous, intramuscular, transdermal, intrathecal or
intracerebral injection. Administration can be either rapid, as by injection, or over a period of
time, as by slow infusion or administration of a slow-release formulation.

Compositions according to the invention can be employed in the form of 
pharmaceutical or veterinary preparations. Such preparations are made in a manner well 
known in the pharmaceutical and veterinary arts. One preferred preparation utilizes a vehicle 
of physiological saline solution, but it is contemplated that other pharmaceutically acceptable
carriers such as physiological concentrations of other non-toxic salts, five percent aqueous
glucose solution, sterile water or the like may also be used. It may also be desirable that a
suitable buffer be present in the composition. Such solutions can, if desired, be lyophilized
and stored in a sterile ampoule ready for reconstitution by the addition of sterile water for
ready injection. The primary solvent can be aqueous or alternatively non-aqueous.

The carrier can also contain other pharmaceutically-acceptable excipients for
modifying or maintaining the pH, osmolarity, viscosity, clarity, color, sterility, stability, rate
of dissolution, or odor of the formulation. Similarly, the carrier may contain still other
pharmaceutically-acceptable excipients for modifying or maintaining release or absorption or
penetration across the blood-brain barrier. Such excipients are those substances usually and
customarily employed to formulate dosages for parenteral administration in either unit dosage
or multi-dose form or for direct infusion into the cerebrospinal fluid by continuous or periodic
infusion.

It is also contemplated that certain formulations containing a chimeric virus according 
to the invention are to be administered orally. Such formulations are preferably encapsulated 
and formulated with suitable carriers in solid dosage forms. Some examples of suitable
carriers, excipients, and diluents include lactose, dextrose, sucrose, sorbitol, mannitol,
starches, gum acacia, calcium phosphate, alginates, calcium silicate, microcrystalline
cellulose, polyvinylpyrrolidone, cellulose, gelatin, syrup, methyl cellulose, methyl- and
propylhydroxybenzoates, talc, magnesium stearate, water, mineral oil, and the like. The
formulations can additionally include lubricating agents, wetting agents, emulsifying and
suspending agents, preserving agents, sweetening agents or flavoring agents. The
compositions may be formulated so as to provide rapid, sustained, or delayed release of the
active ingredients after administration to the patient by employing procedures well known in
the art. The formulations can also contain substances that diminish proteolytic degradation
and promote absorption such as, for example, surface active agents.

The specific dose is calculated according to the approximate body weight or body 
surface area of the patient or the volume of body space to be occupied. The dose will also be 
calculated dependent upon the particular route of administration selected. Such calculations 
can be made without undue experimentation by one skilled in the art. Exact dosages are
determined in conjunction with standard dose-response studies. It will be understood that the
amount of the composition actually administered will be determined by a practitioner, in the
light of the relevant circumstances including the condition or conditions to be treated, the
choice of composition to be administered, the age, weight, and response of the individual
patient, the severity of the patient's symptoms, and the chosen route of administration. Dose
administration can be repeated depending upon the pharmacokinetic parameters of the dosage
formulation and the route of administration used.

Replication-competent HCV-pestiviruses are generated by choosing the HCV
function or sequence element desired to be studied. The HCV sequence can be obtained from
a plasmid clone of a partial or full HCV genome using PCR to amplify a target region
containing the desired sequence or by restriction enzyme digestion. The HCV fragment is
then inserted into the desired location of a clone of the pestivirus genome using standard
techniques. Desired portions of the pestivirus genome may be deleted before or after addition
of the HCV fragment. The recombinant genome is then transfected into a cell that supports
replication of the parental pestivirus genome, and its ability to replicate is tested using standard
assays. For example, replication can be assessed by virus-induced cytopathic effect; plaque
formation; detection of viral antigens and/or viral RNA accumulation; and by plaque assay
measuring released infectious virus. The inventors herein have found that the BVDV RNA
replication machinery works in many cell types, including bovine, hamster, mouse and human
cells. It has also been reported that BVDV RNAs can amplify in other cell types including
human hepatoma lines and hepatocytes (Behrens SE, et al., J Virol 1998 Mar;72(3):2364-72).
The host cell range for a particular chimera will be dependent upon the properties of that
chimera as empirically determined.

As described below, some chimeras do not replicate stably as indicated by 
heterogeneity in the size of plaques produced by the chimeric virus. Upon passage, 
pseudorevertants can frequently be isolated that are capable of stable replication. Such
pseudorevertants will have one or more deletions or base substitutions in the HCV and/or 
pestivirus sequences. Information derived from these gain-of-function mutations can be used 
to define the elements necessary for generating stable, replication-competent chimeras of 
HCV and a pestivirus. 

The invention provides a method for screening compounds for antiviral activity
against HCV. The method involves comparing a test compound's effect on replication of a
chimeric HCV-pestivirus RNA molecule as described above with the compound's effect on
replication of the parental pestivirus. Compounds which have a greater effect on replication
of the chimeric virus than the pestivirus are likely directed against the HCV portion of the
chimera. Typically, the method is performed by providing duplicate cell cultures containing a
chimeric viral RNA which is replication-competent in that cell, treating one of the cultures
with the test compound, and then measuring the replication efficiency of the chimeric RNA in
both cultures. Any effect induced by the compound is compared against the compound's
effect on replication of the parental pestivirus in cells of the same type. This control assay is
preferably performed at the same time using the same culture conditions.
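
As an illustration only, the comparison described above can be sketched in Python. The function names, the choice of percent inhibition as the readout, and the 20-point margin are assumptions made for this sketch and are not taken from the specification; any replication readout (plaque counts, titers, RNA accumulation) could be substituted.

```python
def percent_inhibition(untreated: float, treated: float) -> float:
    """Percent reduction of a replication readout caused by the test compound."""
    if untreated <= 0:
        raise ValueError("untreated readout must be positive")
    return 100.0 * (1.0 - treated / untreated)


def likely_hcv_directed(chimera_untreated: float, chimera_treated: float,
                        parent_untreated: float, parent_treated: float,
                        margin: float = 20.0) -> bool:
    """Flag a compound that inhibits the chimera more than the parental pestivirus.

    `margin` (percentage points) is a hypothetical cutoff chosen for this sketch.
    """
    chimera = percent_inhibition(chimera_untreated, chimera_treated)
    parent = percent_inhibition(parent_untreated, parent_treated)
    return chimera - parent >= margin


# Hypothetical plaque counts from duplicate cultures, with and without compound
print(likely_hcv_directed(200, 50, 200, 180))  # chimera strongly inhibited, parent barely
print(likely_hcv_directed(200, 50, 200, 60))   # both inhibited about equally
```

A compound that inhibits both viruses to a similar degree is not flagged, which mirrors the reasoning that only a differential effect points to the HCV portion of the chimera.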

The cells used in the screening assay can be prepared by transiently transfecting the 
cells with the desired chimeric RNA molecule as described below. Alternatively, it is 
contemplated that the chimeric RNA molecule can be constitutively expressed in the cell by 
transfecting the cell with a polynucleotide comprising a cDNA of the chimeric RNA operably
linked to a DNA-dependent promoter. The chimeric cDNA may include a selectable marker,
which would allow for selection of cells expressing the chimeric RNA. It is also envisioned
that the selectable marker could be a dominant marker that allows selection of cells expressing
chimeras having adaptive mutations or selection of cells permissive for virus replication
(Frolov et al., J. Virol. 73:3854-3865, 1999). It is also contemplated that the cDNA could express
a reporter gene that could be assayed to measure RNA replication.

Alternatively, chimeric virus particles are incubated with a cell permissive for 
infection by the pestivirus in the presence or absence of the test compound and then 
replication of the chimeric virus is measured and compared to the replication of the parental 
pestivirus incubated with the same cell type in the presence or absence of the test compound.

Inhibition of replication can be measured in many ways, including assaying for the
reduction of virus-induced cytopathic effect; inhibition of plaque formation; reduced
production of viral antigens as detected by immunofluorescence assay; reduced viral RNA
accumulation; and reduction in released infectious virus from treated and untreated control and
chimera samples using a plaque assay. In addition, it is contemplated that a cell line that is
designed for pestivirus-specific transactivation of a reporter gene could be used directly or in
lieu of a plaque assay. The reporter gene is operably linked to a promoter that is activated
upon infection by the chimeric virus and production of the viral transactivator protein.

Preferred embodiments of the invention are described in the following examples. 
Other embodiments within the scope of the claims herein will be apparent to one skilled in the
art from consideration of the specification or practice of the invention as disclosed herein. It 
is intended that the specification, together with the examples, be considered exemplary only, 
with the scope and spirit of the invention being indicated by the claims which follow the 
examples. 

Example 1

This example illustrates the construction and analysis of 5' HCV-BVDV chimeras as 
reported in detail in Frolov et al. (RNA 4:1418-1435, 1998) which is incorporated in its
entirety by reference. A functional clone of BVDV (Mendez et al., J. Virol. 72:4737-4745,
1998) was used to construct and characterize a series of 5' NTR chimeras with sequences
derived from HCV and the picornavirus, encephalomyocarditis virus (EMCV). The results
help to define the requirements of a functional BVDV 5' NTR and provide
replication-competent BVDV-HCV chimeras dependent on a functional HCV IRES.

Example 2 

This example illustrates the construction of chimeras for expressing additional 
functional portions of the HCV genome by addition of further HCV sequence downstream
from the functional or adapted HCV 5' NTR chimeras fused in-frame to the BVDV ORF.

One such construct (Figure 21) involves fusion of HCV sequences to BVDV 
sequences in the p7 protein coding region (at a convenient BseRI restriction site). Both HCV 
and BVDV encode a p7 protein that is located immediately downstream of the E2 protein. 
The p7 protein is a small hydrophobic protein of unknown function. pCBV/p7 consists of the
first 79 bases of the BVDV 5' NTR encoding stem loop structures B1' and B1, followed by the
entire HCV 5' NTR, the entire HCV structural protein coding region and the first 36 amino
acids of HCV p7 fused to the C-terminal 31 amino acids of BVDV p7. The fused p7 gene is
followed by the remainder of the BVDV ORF including the entire nonstructural region and
the BVDV 3' NTR. Transfection of MDBK cells with the RNA corresponding to this
sequence (Figure 22) leads to replication of the chimeric RNA and production of the expected
HCV and BVDV polyprotein cleavage products. Variations on this strategy are envisioned in
which all or part of the HCV polyprotein and cis elements important for RNA packaging can
be expressed in viable chimeras. In addition the BVDV replicase regions for either cytopathic
or non-cytopathic pestiviruses (like NADL cIns-) can be used. Transfection of cells
permissive for HCV particle assembly, release and reinfection with this chimeric RNA can
be used to make HCV-like particles. These particles and this infection system can be used (i)
to screen for specific inhibitors of HCV particle assembly, release and reinfection, (ii) for
identifying antibodies capable of neutralizing HCV infectivity and (iii) as live or inactivated
vaccines. Furthermore, this embodiment of the invention demonstrates that the BVDV RNA
replication machinery can be used for expression of heterologous RNA and polypeptide
sequences and can be used as a vehicle for RNA or DNA "genetic" vaccination in which the
BVDV replicase amplifies the level of antigen expression by cytoplasmic RNA-dependent
replication.


Example 3 

This example illustrates chimeric RNAs that are modified to express dominant
selectable markers, assayable markers or FACS sortable markers.

Such variants can be used to select for chimeras capable of replication in particular
cell types, or to screen for cell types that are permissive for replication of the chimeric RNA.
Selectable markers include, but are not limited to, the genes encoding puromycin resistance
(puromycin N-acetyl transferase; PAC), neomycin resistance, blasticidin resistance,
hygromycin resistance, etc. Assayable markers include, but are not limited to, the genes
encoding β-galactosidase, luciferase, β-glucuronidase, etc. Easily sortable molecules include
single chain antibodies, cell surface markers, and non-toxic protein markers like green
fluorescent protein. In a specific example (Figures 23 and 24), the RNA encoded by
pCBV/p7 was modified to include a cassette at the beginning of the BVDV  that is
comprised of the EMCV IRES driving the gene encoding PAC. This chimeric RNA can
replicate, expresses PAC and confers resistance to puromycin. This property can
be used to select for variants of the chimera that are capable of noncytopathic replication in
desired cell types and also provides a means of showing that cells harbor a functional
chimeric RNA. Desired variants can be identified, cloned and further characterized as
described in Example 1. Of note, this location in the BVDV genome and this strategy
for expressing heterologous genes may also be applied to using infectious attenuated
pestiviruses as gene expression vectors and as chimeric live vaccines against other animal
pathogens.

Example 4 


This example illustrates the use of the bicistronic strategy as an alternative to the in- 
frame fusions described in Example 2. 

A specific example is shown in Figure 25 and its sequence in Figure 26. In this
bicistronic chimera, the 5' sequences are identical to that of pCBV/p7 except that the HCV
ORF continues to include the first 246 amino acids of NS4B. The HCV sequence is followed
by the EMCV IRES fused to BVDV Npro, the N-terminal 10 aa of BVDV C, the C-terminal
19 aa of C, 9 N-terminal amino acids of Erns, 48 C-terminal amino acids of E2 and the
remainder of the BVDV NADL ORF and 3' NTR. The constructed BVDV ORF encodes a
functional BVDV RNA replicase. The deletions in the N-terminal portion of this ORF were
designed to preserve proper membrane topology and processing of the replicase. The
bicistronic chimeric RNA can replicate upon transfection of permissive BVDV host cells.

Example 5

This example illustrates 3' NTR chimeras. Although initial attempts to recover viable
chimeric viruses in which the BVDV 3' NTR was completely replaced by that of HCV were
unsuccessful, a strategy similar to that detailed in Example 1 has produced chimeras that
harbor the conserved elements of the HCV 3' NTR. An initial tandem 3' NTR construct was
made in which the HCV 3' NTR was engineered to follow the BVDV ORF. The complete
BVDV 3' NTR was positioned 3' to the HCV 3' NTR after a short heterologous sequence. The
sequence of this parental construct, which replicated poorly, is shown in Figure 19. RNAs
transcribed from this plasmid were of low specific infectivity, suggesting that revertants or
pseudorevertants might have arisen. Indeed, isolation and sequence analysis of several
independent plaque-forming variants revealed that deletions in the HCV poly U tract of
various lengths had occurred. These revertant sequences are shown in Figure 20. When these
altered HCV 3' NTRs were reconstituted into the original tandem 3' NTR parent, they gave
rise to plaque forming RNA transcripts of high specific infectivity, demonstrating that these
alterations restored the ability of the chimeric RNA to replicate. Large deletions in the U tract
gave rise to virus with more robust replication and larger plaques while stably maintaining the
conserved HCV 3' NTR 98-base element and the polypyrimidine "transition" region. Such
chimeric viruses can now be used to screen and evaluate antisense, ribozyme, and other
therapeutics targeted against this conserved HCV RNA element that is essential for
replication.

Materials and Methods

Plasmid Constructs 

pACNR/BVDV NADL was previously described (Mendez et al., 1998, supra). 
pBVDV is a derivative of pACNR/BVDV NADL which contains a G->T transversion at nt
14994 that creates an Xba I site upstream of the T7 promoter (T. Myers & CM. Rice,
unpubl.). To facilitate construction of the chimeras, subclones were created. First, two
fragments were isolated by PCR amplification of p90/HCVFLlongpU (Kolykhalov et al.,
Science 277:570-574, 1997) with primers #498 (5'-TGTACATGGCACGTGCCAGCCCC)
and #498 (5'-GATCAACTCCATGGTGCACGGTCT) and pBVDV with primers #481 (5'-
AGACCGTGCACCATGGAGTTGATC) and #482 (5'-
CGTTTCACACATGGATCCCTCCTC). These two fragments were digested with ApaL I
and ligated to produce a fragment containing a fusion of the HCV 5' NTR to the BVDV ORF.
This fragment was digested with Sac I and ligated into pGEM3Zf(-) which had been digested
with Sma I and Sac I to produce the subclone pGEM498-SacI. Next, a fragment containing
the BVDV 5' NTR was synthesized by PCR amplification of pBVDV with primers #183 (5'-
TTTTCTAGATAATACGACTCACTATAGTATACGAGAATTAGAAAAGGCACTCG)
and #480 (5'-GGGGGCTGGCACGTGCCATGTACA). This fragment was digested with
Xba I and BsrG I and ligated into pGEM498-SacI digested with the same two enzymes, to
create the plasmid pGEMXbaI-SacI. pGEMXbaI-SacI contains a tandem fusion of the BVDV
5' NTR, the HCV 5' NTR, and the 5' portion of the BVDV Npro gene. pBVDV + HCV was
created by digesting pGEMXbaI-SacI with Xba I and Sac I and ligating the fragment into
pBVDV digested with the same two enzymes, and as such pBVDV + HCV contains the T7
promoter, followed by the entire 385-nt 5' NTR of BVDV, a GT dinucleotide (nt 386-387),
the entire 341-nt 5' NTR of HCV (nt 388-728), and the sequence of the BVDV NADL strain
including the ORF and 3' NTR. Derivatives of pBVDV + HCV containing deletions within
the BVDV 5' NTR and/or the HCV 5' NTR were created in the subclone pGEMXbaI-SacI, as
described below, prior to ligation into Xba I- and Sac I-digested pBVDV. For making
deletions, restriction sites with non-compatible protruding ends were treated with the
Klenow fragment of DNA polymerase I prior to ligation. For creation of pBVDV +
HCVdelB3 (deletion of nt 174-374, inclusive), pGEMXbaI-SacI was digested with Afl II and
BsrG I. For pBVDV + HCVdelB2B3 (deletion of nt 67-374), pGEMXbaI-SacI was digested
with Avr II and BsrG I. For pBVDV + HCVdelB1B2B3 (deletion of nt 33-374), pGEMXbaI-SacI
was digested with SnaB I and BsrG I. For pBVDV + HCVdelB2B3H1 (deletion of nt
67-3396), pGEMXbaI-SacI was digested with Avr II and Xcm I. For pBVDV +
HCVdelB2B3H1H2 (deletion of nt 67-513), pGEMXbaI-SacI was digested with Avr II and
Bsg I. For pBVDV + HCVdelB2B3H3 (deletion of nt 67-374, 518-704), subclone
pGEMXbaI-SacIdelB2B3 was digested with Sma I. p5'HCV was created by digesting
p90/HCVFLlongpU with Xba I and Nru I and ligating the fragment into pBVDV + HCV
digested with the same two enzymes.
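
The nucleotide numbering used for the tandem 5' NTR of pBVDV + HCV can be checked with a short calculation. The Python sketch below is illustrative only: the constant and function names are hypothetical, and only the lengths and deletion ranges that are legible in the text above are used.

```python
# Lengths stated in the text for the tandem 5' NTR of pBVDV + HCV
BVDV_5NTR_LEN = 385   # entire BVDV 5' NTR
LINKER_LEN = 2        # GT dinucleotide
HCV_5NTR_LEN = 341    # entire HCV 5' NTR


def region_bounds() -> dict:
    """Return 1-based, inclusive coordinates of each region of the tandem 5' NTR."""
    bvdv = (1, BVDV_5NTR_LEN)
    linker = (bvdv[1] + 1, bvdv[1] + LINKER_LEN)
    hcv = (linker[1] + 1, linker[1] + HCV_5NTR_LEN)
    return {"BVDV 5' NTR": bvdv, "GT linker": linker, "HCV 5' NTR": hcv}


def deletion_length(first: int, last: int) -> int:
    """Number of nucleotides removed by an inclusive deletion first..last."""
    return last - first + 1


# Deletion ranges as given in the text (construct name -> inclusive range)
DELETIONS = {
    "HCVdelB3": (174, 374),
    "HCVdelB2B3": (67, 374),
    "HCVdelB1B2B3": (33, 374),
    "HCVdelB2B3H1H2": (67, 513),
}

for name, bounds in region_bounds().items():
    print(name, bounds)
for name, (first, last) in DELETIONS.items():
    print(name, deletion_length(first, last), "nt deleted")
```

The computed bounds reproduce the stated positions of the GT dinucleotide (nt 386-387) and the HCV 5' NTR (nt 388-728).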

The EMCV plasmid, pECg, was provided by Ann Palmenberg and is described
elsewhere (Hahn et al., J. Virol. 69:2697-2699, 1995). p5'EMCV contains the entire 710 nt of
the 5' NTR of EMCV, followed by the open reading frame of BVDV and the 3' NTR. One
extra G residue was added between the T7 promoter and the first nucleotide of the EMCV 5'
NTR to facilitate efficient in vitro transcription. Convenient restriction sites within the
BVDV 5' NTR or the EMCV 5' NTR were used to create additional chimeras. Sites with
noncompatible protruding ends were treated with the Klenow fragment of DNA polymerase I
prior to ligation. For example, the plasmid pBVDV + EMCVdelA contains nt 1-378 of
BVDV 5' NTR fused with nt 45-710 of EMCV (the BsrG I site of BVDV ligated to the EcoR
V site of EMCV). pBVDV + EMCVdelB3A contains nt 1-173 of BVDV fused with nt 45-710
of EMCV (the Afl II site of BVDV ligated to the EcoR V site of EMCV). pBVDV +
EMCVdelB2B3A contains nt 1-66 of BVDV fused with nt 45-710 of EMCV (the Avr II site
of BVDV ligated to the EcoR V site of EMCV). pBVDV + EMCVdelB3ABC contains nt
1-173 of BVDV fused with nt 161-710 of EMCV (the Afl II site of BVDV ligated to the
Psp1406 I site of EMCV). pBVDV + EMCVdelB2B3ABC contains nt 1-66 of BVDV fused with nt
161-710 of EMCV (the Avr II site of BVDV ligated to the Psp1406 I site of EMCV). pBVDV
+ EMCVdelB3A-H contains nt 1-101 of BVDV fused with nt 289-710 of EMCV (the Nhe I
site of BVDV ligated to the Avr II site of EMCV). pBVDV + EMCVdelB2B3A-H contains
nt 1-62 of BVDV fused with nt 289-710 of EMCV (the Avr II site of BVDV ligated to the Avr
II site of EMCV). The schematics of the chimeric 5' NTRs are presented in Figures 2 and 4.
All other heterologous 5' NTRs used in the study were generated by PCR using an
oligonucleotide complementary to nt 256-272 of the HCV 5' NTR and primers containing the
sequence of the Xba I restriction site followed by the T7 promoter, the heterologous
sequences found in sequenced pseudorevertants, or sequences corresponding to different
regions of the HCV 5' NTR. All the fragments were subcloned into the plasmid, pRS2 (a
derivative of pUC19), sequenced, and recloned into the p5'HCV plasmid by replacing the
fragment between the Xba I site located upstream of the T7 promoter and the Nhe I site (nt
249-254) in the 5' NTR of HCV.
Cell cultures 

MDBK cells were obtained from M. Collett (ViroPharma, Inc.) and BT cells were
obtained from the American Type Culture Collection (Rockville, Maryland). Cells were
grown in Dulbecco's modified Eagle medium (D-MEM) supplemented with 10% horse serum
and sodium pyruvate.
Transcriptions and transfections 

All the designed plasmids, including pBVDV and the chimeric derivatives, were
digested to completion with Sda I (Sse8387 I), purified by phenol extraction, precipitated by
ethanol, and dissolved in water. The transcription reactions were performed using the T7
Megascript kit (AMBION) using the conditions recommended by the manufacturer.
Reactions were incubated at 37°C for 1 h, and 3H-UTP was added to the reaction to quantify
the RNA synthesis. The quality of the synthesized RNAs was checked by agarose gel
electrophoresis, and samples containing 50-60% of full-length RNA were used for
electroporations and in vitro translations. The reaction mixtures were aliquoted and stored at
-70°C prior to electroporation or in vitro translations.

Transfection was performed by electroporation of MDBK cells using previously
described conditions (Mendez et al., 1998, supra). Two micrograms of in vitro synthesized
RNA, corresponding to approximately 1 µg of the full-length transcript, were used per
electroporation. In standard experiments, ten-fold dilutions of electroporated cells were
seeded in 6-well tissue culture plates containing 5 x 10^ naive MDBK cells per well. After 1
h of incubation at 37°C in a 5% CO2 incubator, cells were overlaid with 3 ml of 0.6% LE
SeaKem agarose (FMC Bioproducts) containing minimal essential medium supplemented
with 5% horse serum. Plaques were stained with crystal violet after 3 days incubation at
37°C. The rest of the transfected cells were seeded into 100-mm dishes and incubated for
approximately 48 h or until cytopathic effect was observed in virtually all cells. Samples of
the media were taken at 24 and 48 h, and virus titers were determined as described above and
previously (Mendez et al., 1998, supra).
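
As a hedged illustration of how a specific infectivity could be derived from such a dilution series, the Python sketch below converts a plaque count from one well into PFU per microgram of RNA. The function name, its parameters, and the example numbers are hypothetical and are not taken from the specification.

```python
def pfu_per_microgram(plaques: int, dilution_factor: int, rna_micrograms: float) -> float:
    """Specific infectivity from one countable well.

    plaques         -- plaques counted in the well
    dilution_factor -- fold dilution of the electroporated cells seeded in that well
                       (e.g. 10**3 for the third ten-fold dilution)
    rna_micrograms  -- mass of full-length transcript used for the electroporation
    """
    if rna_micrograms <= 0:
        raise ValueError("RNA mass must be positive")
    return plaques * dilution_factor / rna_micrograms


# Hypothetical example: 25 plaques in the 10^3-fold dilution well, 1 ug full-length RNA
print(pfu_per_microgram(25, 10**3, 1.0))
```

Normalizing to the mass of full-length transcript rather than total RNA follows the text, which notes that only part of each transcription product is full length.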

Analysis of the 5' ends of viral genomes

Sequencing of the 5' ends of selected variants of BVDV was performed on plaque-
purified viruses. Plaques were typically isolated from the agarose overlay without staining
with neutral red. Virus was eluted in 1 ml of D-MEM/10% horse serum for several hours and
was used to infect 5 x 10^ MDBK cells in 35-mm dishes. After 1 h of virus adsorption at
37°C, an additional 1 ml of D-MEM/10% horse serum was added to the dishes, and incubation
was continued for 36-48 h until cytopathic effect was observed in virtually all cells.

Fifty microliters of harvested viral stocks were clarified by low speed centrifugation,
and viral RNAs were isolated by TRIzol reagent (Gibco-BRL) using the protocol
recommended by the manufacturer. Sequencing of the 5' termini was performed using an
oligonucleotide/cDNA-ligation strategy described elsewhere (Troutt et al., Proc. Natl. Acad.
Sci. USA 89:9823-9825, 1992). The primer S1 (5'-GTCGTTTCACACATGGATCC),
complementary to nt 710-729 of the BVDV genome, was used for cDNA synthesis. A
phosphorylated oligonucleotide tag (5'-GACTGTTGTGGCCTGCAGGGCCGAATT) with an
amino group on the 3' terminus was ligated to the first strand cDNA (Troutt et al., 1992,
supra). One tenth of this reaction mixture was used for PCR amplification. The primers for
PCR amplification were as follows: primer A (5'-GCCCTGCAGGCCACAACAGTC),
complementary to the tag; primer B (5'-TCAGGCAGTACCACAA) complementary to nt
281-296 of the HCV 5' NTR; and primer C (5'-GGAATGCTCGTCAAGAAGACAG),
complementary to nt 268-289 of the EMCV 5' NTR. The primer pairs of A + B or A + C
were used for analysis of the pseudorevertants of 5'HCV and BVDV + HCVdelB1B2B3 or
5'EMCV, respectively. For the 5'HCV pseudorevertants, one tenth of the ligation mixture
was used for an additional PCR reaction. This fragment was synthesized using primer S1,
described above, and a primer corresponding to nt 147-175 of the HCV genome. Fragments
were purified by agarose gel electrophoresis and cloned into the plasmid pRS2. Multiple
independent clones were sequenced by the standard dideoxy-mediated chain termination
methods using the Sequenase version 2.0 DNA Sequencing Kit (USB).
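
The stated complementarity between primer A and the ligated tag can be verified directly from the two sequences given above. The following Python sketch is illustrative; the helper names are hypothetical, and only the tag and primer A sequences come from the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def anneals_to(primer: str, template: str) -> bool:
    """True if the primer is perfectly complementary to some stretch of the template."""
    return reverse_complement(primer) in template.upper()


TAG = "GACTGTTGTGGCCTGCAGGGCCGAATT"   # phosphorylated tag ligated to first-strand cDNA
PRIMER_A = "GCCCTGCAGGCCACAACAGTC"    # primer A, described as complementary to the tag

print(reverse_complement(PRIMER_A))
print(anneals_to(PRIMER_A, TAG))
```

The reverse complement of primer A matches the first 21 nucleotides of the tag, so extension from primer A proceeds off the tag into the cDNA, which is what the amplification scheme with primers A + B or A + C requires.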
Cell-free translation

Cell-free translation reactions were performed in reticulocyte extracts (Promega)
using conditions recommended by the manufacturer. Usually 0.1-1 µg of the same in vitro
synthesized RNAs used in transfection experiments were used in 25 µl translation reactions.
After 45 min of incubation at 30°C, 2 µl were dissolved in 10 µl of sample buffer, and those
samples were analyzed by sodium dodecyl sulfate PAGE. Labeled proteins were visualized
by autoradiography of the dried gel. The efficiency of translation was measured using
phosphorimager analysis (Molecular Dynamics) by comparing the radioactivity in the band
corresponding to the protein. In preliminary experiments, an eightfold increase in
incorporation was observed for translation of 4 µg versus 0.4 µg BVDV transcript RNA.
Quantitative data were obtained from reactions using subsaturating (0.4 µg) amounts of
BVDV or BVDV chimera transcript RNAs.
Analysis of virus-specific RNAs

The protocols used for radioactive labeling of virus-specific RNAs are described in 
the appropriate figure legends. RNAs were isolated from the cells by using TRIzol reagent as 
recommended by the manufacturer (Gibco-BRL). After denaturation with glyoxal in 
dimethylsulfoxide, cellular RNAs were analyzed by electrophoresis in a 1% agarose gel
containing a 10 mM phosphate buffer. Pieces of the dried gel containing the appropriate 
RNA bands were excised, and their radioactivity measured by liquid scintillation counting. 



Results 

Features of the BVDV, HCV, and EMCV 5' NTRs important for chimera design

Schematic representations of the proposed secondary structures of the 5' NTRs of
HCV, BVDV, and EMCV are shown, and the location of each IRES is indicated in Figure 1.
EMCV is a member of the cardiovirus genus within the family Picornaviridae. While not a
member of the Flaviviridae, EMCV is similar to HCV and BVDV in that it is a positive-
strand RNA virus shown to contain an IRES within its 5' NTR (Jang et al., J. Virol. 62:2636-
2643, 1988). Based on their proposed secondary structures, the HCV IRES and the BVDV
IRES have been classified as type 3 IRESs, while the EMCV IRES is classified as a type 2
IRES (Lemon & Honda, Semin. Virol. 8:274-288, 1997). However, these three IRESs as
well as IRESs from other members of the Flaviviridae and the Picornaviridae have been
proposed to contain a common structural core (Le et al., Virus Genes 12:135-147, 1996).

The model for the secondary structure of the 341-nt HCV 5' NTR has been refined by
enzymatic and chemical analysis of synthetic transcripts (Brown et al., Nucl. Acids Res.
20:5041-5045, 1992; Wang et al., J. Virol. 68:7301-7307, 1994; Honda et al., RNA 2:955-968,
1996; Lima et al., 1997). This element contains four discrete hairpins (referred to here as H1,
H2, H3 and H4) and a pseudoknot at the base of hairpin H3 (Wang et al., 1995). The
secondary structure of the 385-nt BVDV 5' NTR has not been as extensively studied, but is
proposed to be similar to that of HCV (Brown et al., 1992) with four discrete hairpins
(referred to here as B1', B1, B2, and B3) and a pseudoknot at the base of B3 (Rijnbrand et al.,
1997). The secondary structure of the longer (>700 nt) EMCV 5' NTR consists of a series of
hairpins A-M (Duke et al., 1992; Hoffman & Palmenberg, 1996). Recently, a revised model
of the EMCV 5' NTR suggests moderately different secondary structures for the C and G
subregions, and significantly different secondary structures for the I-M subregion
(Palmenberg & Sgro, 1997).

For HCV, H1 is nonessential for IRES function (Reynolds et al., 1995; Rijnbrand et
al., 1995; Honda et al., 1996b; Reynolds et al., 1996; Kamoshita et al., 1997) and its deletion
has actually increased translation efficiency in some analyses (Rijnbrand et al., 1995; Honda
et al., 1996b). Most studies have found that hairpins H2 and H3 and the pseudoknot are
essential for IRES function (Wang et al., 1993; Rijnbrand et al., 1995; Honda et al., 1996b).
However, two studies indicate that H2 may not be essential (Tsukiyama-Kohara et al., 1992;
Urabe et al., 1997). The 3' boundary of the HCV IRES is more controversial. The IRES
clearly extends to the AUG initiation codon. However, some studies indicate that sequences
affecting the efficiency of translation initiation extend into the ORF (Reynolds et al., 1995;
Honda et al., 1996a; Honda et al., 1996b; Lu & Wimmer, 1996). By analogy to the HCV
IRES and the related pestivirus CSFV IRES, the BVDV IRES probably requires hairpins B2
and B3 and the pseudoknot for function, with B1' and B1 probably not required for IRES
activity (Poole et al., 1995; Rijnbrand et al., 1997). For EMCV, hairpins H-L have been
shown to be required for IRES function in mono- or dicistronic constructs (Jang & Wimmer,
1990; Duke et al., 1992). The remaining portion of the EMCV 5' NTR is thought to be
required for RNA replication or unknown steps in viral replication that are important for
pathogenesis (Duke et al., 1990; Martin & Palmenberg, 1996).

Replacement of the BVDV 5' NTR with the HCV 5' NTR results in a large decrease in
specific infectivity

Since the BVDV 5' NTR and the HCV 5' NTR are proposed to have similar RNA
secondary structure and functional organization, an experiment was performed to test whether
the BVDV 5' NTR could be replaced by the HCV 5' NTR. p5'HCV has an exact replacement
of the BVDV 5' NTR with that of HCV (Fig. 2A) while the coding sequence and 3' NTR of
p5'HCV are identical to pBVDV. Positioning of the HCV 5' NTR in such a manner was
necessary since translation initiation from the HCV IRES begins at or near the AUG start
codon (Honda et al., 1996a; Reynolds et al., 1995; Reynolds et al., 1996; Rijnbrand et al.,
1996). The specific infectivity of 5'HCV RNA synthesized in vitro was compared to that of
BVDV RNA by transfection of MDBK (bovine kidney) cells (Fig. 2A). The specific
infectivity of BVDV RNA was approximately 4x10*^ plaque forming imits (PFU)/|ug RNA. 
In contrast, the specific infectivity of 5' HCV RNA was near the limit of detection (30-50 
30 PFU/^ig RNA) and considerable plaque heterogeneity was apparent. These results suggested 
that the HCV 5' NTR replacement chimera might be incapable of efficient replication and 
plaque formation and that the plaque forming virus observed had arisen by secondary 
mutation(s). Sequence analysis of plaque-purified 5* HCV viruses presented below confirmed 
that the repUcating pool of virus contained such pseudorevertants. 
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Next, the in vitro translation efficiency of these two RNAs in rabbit reticulocyte 
extracts was analyzed to test whether the defect in specific infectivity of 5* HCV RNA could 
be attributed to lower translation efficiency. Although the specific infectivity of 5* HCV RNA 
was reduced ~5 logs compared to BVDV RNA, its translation efficiency was only slightly 
5 reduced, -twofold (Fig. 3, lane 1 vs. lane 2). The apparent size of the N-terminal cleavage 
product, N*^, was identical for both RNAs, suggesting that translation initiated with the 
correct AUG. These data are consistent with the hypothesis that the BVDV 5' NTR contains 
signals that are required for a step in rephcation other than translation which are not present in 
the 5' HCV chimera. 
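
The contrast drawn above between a ~5 log loss of specific infectivity and a ~twofold loss of translation efficiency can be put on a common scale. The following is a minimal sketch, assuming only that a "log" means a factor of ten; the function name and the example ratios are illustrative and are not measurements from the figures.

```python
import math

def log10_fold_change(reference: float, test: float) -> float:
    """Return log10(reference / test), i.e. how many logs lower `test` is."""
    if reference <= 0 or test <= 0:
        raise ValueError("both values must be positive")
    return math.log10(reference / test)

# A twofold reduction (e.g. in translation efficiency) is about 0.3 log.
translation_drop = log10_fold_change(2.0, 1.0)

# A 100,000-fold reduction (e.g. in specific infectivity) is 5 logs.
infectivity_drop = log10_fold_change(1e5, 1.0)

print(f"twofold = {translation_drop:.2f} log; 1e5-fold = {infectivity_drop:.0f} logs")
```

On this scale the translation defect accounts for well under one log of the infectivity defect, which is the basis for attributing the remainder to a step other than translation.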

Given the low specific infectivity of 5' HCV RNA, an experiment was performed to
test the effect of placing the BVDV 5' NTR sequence upstream of the HCV 5' NTR, resulting
in tandem BVDV and HCV 5' NTRs (called BVDV + HCV). This arrangement actually
decreased translation efficiency (Fig. 3, lane 14 vs. lane 1) yet restored infectivity (Fig. 2A).
The plaques produced by BVDV + HCV were also heterogeneous in size, indicating that this
virus was unstable. Upon passage, RT-PCR analysis indicated that pseudorevertants had
indeed arisen in which portions of the BVDV and/or HCV 5' NTRs had been deleted (data not
shown). These data show that sequences in the BVDV 5' NTR required for virus replication
can function when placed upstream of a functional HCV IRES driving translation of the
BVDV polyprotein.

Hairpins B1' and B1 in conjunction with the HCV IRES are sufficient for stable and
efficient BVDV replication

The sequences within the BVDV 5' NTR that restored replication in the context of the
HCV 5' NTR were mapped using three deletion variants. The deletion within BVDV +
HCVdelB3 removed a large portion of hairpin B3; the deletion within BVDV + HCVdelB2B3
removed hairpins B2 and B3; and the deletion within BVDV + HCVdelB1B2B3 removed
hairpins B1, B2 and B3. The specific infectivities of RNAs from these deletion mutants were
near that of BVDV RNA (Fig. 2). Upon passage of these viruses, RT-PCR analyses and
sequencing indicated that BVDV + HCVdelB3 and BVDV + HCVdelB2B3 were stably
propagated and produced homogeneous plaques slightly smaller than those of wild-type
BVDV (data not shown). In contrast, BVDV + HCVdelB1B2B3 produced smaller
heterogeneous plaques. Reverse transcription-polymerase chain reaction (RT-PCR) analysis
and sequencing indicated that BVDV + HCVdelB1B2B3 underwent a reversion event
described in more detail below. The translation efficiencies of these three RNAs (Fig. 3,
lanes 9, 10, and 12) were similar to that of BVDV + HCV RNA (Fig. 3, lane 14), indicating
that the deleted portions (hairpins B1, B2, and B3) are not required for translation in the
BVDV + HCV chimera. These results show that B1' and B1 are the minimal elements
sufficient for stable replication in conjunction with the HCV 5' NTR.

Having shown that B1' and B1 are sufficient for replication in conjunction with the
HCV 5' NTR, we next conducted a deletion analysis to determine the sequences within the
HCV 5' NTR of BVDV + HCVdelB2B3 required for replication. A large portion of H1 was
deleted in BVDV + HCVdelB2B3H1, while both H1 and H2 were deleted in BVDV +
HCVdelB2B3H1H2. Of these two RNAs, only BVDV + HCVdelB2B3H1 was as infectious as
parental BVDV RNA (Fig. 2B). However, the BVDV + HCVdelB2B3H1 virus produced
smaller plaques than BVDV + HCVdelB2B3, indicating that hairpin H1 may augment
replication of the chimera. In contrast, BVDV + HCVdelB2B3H1H2 RNA was not
infectious (Fig. 2B) and was translated poorly (Fig. 3, lane 11). Diminished HCV IRES
activity might be due to deletion of hairpin H2 or juxtaposition of BVDV hairpins B1' and B1
with H3. A third derivative of BVDV + HCVdelB2B3, with a Sma I-Sma I deletion
abrogating HCV IRES function by removing H3, was also not infectious (data not shown).
Thus, a 5' NTR consisting of B1' and B1 and a functional HCV IRES is sufficient for stable
BVDV replication in MDBK cells. Similar results were obtained in BT cells, another
BVDV-permissive continuous bovine cell line (data not shown).

Replacement of the BVDV 5' NTR with the EMCV 5' NTR

The following experiment was performed to determine whether the BVDV 5' NTR
could be replaced by the 5' NTR of a more phylogenetically distant virus, EMCV. A
derivative of BVDV was created, called 5' EMCV, that contains an exact replacement of the
BVDV 5' NTR with the EMCV 5' NTR plus an additional guanosine residue at the 5' terminus
for more efficient transcription initiation by T7 polymerase (Fig. 4A). The specific infectivity
of 5' EMCV RNA was more than three orders of magnitude lower than that of BVDV RNA,
indicating that it was defective for replication, although its specific infectivity was higher than
that of 5' HCV RNA (compare Figs. 4A and 2A). Similar to 5' HCV, 5' EMCV produced
heterogeneous plaques, and sequence analysis indicated that pseudorevertants had arisen. The
lower specific infectivity of 5' EMCV RNA was unlikely to reflect a defect in translation,
since the translation efficiency of 5' EMCV RNA was about threefold higher in vitro than that
of BVDV RNA (Fig. 3, lane 20 vs. lane 19).

As with BVDV + HCV, we also tested whether placing the BVDV 5' NTR at the 5'
end of the 5' EMCV RNA would increase its specific infectivity. BVDV + EMCVdelA (Fig.
4A) contained the entire BVDV 5' NTR in tandem with the EMCV 5' NTR lacking a portion
of hairpin A. BVDV + EMCVdelA RNA had a specific infectivity near that of BVDV RNA
(compare Figs. 4A and 2A) despite having a lower translation efficiency than 5' EMCV (Fig.
3, lane 21 vs. lane 20). Similar to the results with BVDV + HCV, this implicates the added
BVDV 5' NTR sequence in a step in viral replication other than translation. Two derivatives
of BVDV + EMCVdelA that contain deletions of portions of the BVDV 5' NTR but maintain
the sequence of B1' and B1, BVDV + EMCVdelB3A and BVDV + EMCVdelB2B3A (Fig.
4A), also were infectious. These derivatives had translation efficiencies near that of the
parental BVDV + EMCVdelA (Fig. 3, compare lanes 15 and 16 with lane 21). This
demonstrated that hairpins B1' and B1 were sufficient for replication in conjunction with a
large portion of the EMCV 5' NTR. Derivatives of BVDV + EMCVdelB3A or BVDV +
EMCVdelB2B3A that contain further deletions of EMCV (BVDV + EMCVdelB3ABC and
BVDV + EMCVdelB2B3ABC in particular) were translated efficiently (Fig. 3, lanes 17 and
18) and were infectious (Fig. 4B). This indicates that the chimeras did not require putative
EMCV RNA replication signals (Martin & Palmenberg, 1996). However, derivatives with
deletions extending into the canonical EMCV IRES were not infectious. For example, BVDV
+ EMCVdelB3A-H and BVDV + EMCVdelB2B3A-H, in which a portion of hairpin H is
deleted, were not infectious (Fig. 4B) and were inefficiently translated in vitro (Fig. 3, lanes
22 and 23). It should be noted that all of the BVDV + EMCV chimeras produced plaques of
heterogeneous size, indicating some instability.

Relatively simple 5' NTR mutations are observed in adapted pseudorevertants

As mentioned previously, BVDV + HCVdelB1B2B3 did not replicate stably, as
indicated by the heterogeneity in the size of plaques produced by this virus. Upon passage
and selection of medium plaque-producing variants, 5' RACE analysis and sequencing
indicated that nt 1-26 had been deleted in the pseudorevertants, removing a large portion of
B1', which was apparently deleterious in the absence of B1. This deletion results in the 5'
terminal sequence 5'-GUAUCG, which is identical to the first six bases of BVDV genome
RNA (Fig. 5) and is repeated at positions 27-32.
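
The effect of such a 5' deletion on the terminal sequence can be sketched as below. This is an illustration only: the toy RNA places the hexamer stated in the text at nt 1-6 and again at nt 27-32, and every other base is a placeholder ("N") rather than BVDV sequence. The helper name `delete_nt` is hypothetical.

```python
def delete_nt(seq: str, first: int, last: int) -> str:
    """Delete nucleotides first..last (1-based, inclusive) from seq."""
    if not (1 <= first <= last <= len(seq)):
        raise ValueError("coordinates out of range")
    return seq[:first - 1] + seq[last:]

# Hypothetical toy 5' end: the hexamer at nt 1-6 and repeated at nt 27-32.
# The 20 filler bases and the 8-base tail are placeholders, not real sequence.
HEXAMER = "GUAUCG"
toy = HEXAMER + "N" * 20 + HEXAMER + "N" * 8

# Removing nt 1-26 brings the internal repeat to the 5' terminus.
revertant = delete_nt(toy, 1, 26)
print(revertant[:6])
```

Because the repeat at nt 27-32 becomes the new 5' end, a simple deletion is enough to regenerate the terminal sequence without any recombination.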

Analysis of the passaged 5' EMCV virus indicated that the replicating progeny had
also undergone a simple deletion of sequence at the 5' end to generate more efficiently
replicating variants (Fig. 5). After electroporation, the 5' EMCV virus pool was passaged 5
times at a multiplicity of infection of 0.1-1 PFU/cell on MDBK or BT cells, and the 5' termini
of three randomly picked plaques were sequenced. For all three plaques selected, nt 2-209
had been deleted, again creating a genome RNA with the 5' terminal tetranucleotide sequence
5'-GUAU.

Analysis of the 5' HCV progeny indicated that more complicated variants had arisen.
Most small plaque-producing variants were unstable and quickly reverted to medium plaque-
producing variants. However, one small plaque-producing variant and two stable medium
plaque-producing variants were isolated. The 5' terminal sequences of the variants were
amplified by rapid amplification of cDNA ends (RACE) and cloned into a plasmid vector, and
sequences for several independent colonies were determined. The sequence of three clones of
the small plaque-producing virus (5'HCV.R1) contained a deletion of HCV sequence from nt
1-34 and an addition of the dinucleotide 5'-AU in two clones and 5'-GU in the third clone.
This creates a 5' terminus of 5'-(G/A)UAA (Fig. 5B), reminiscent of the first three bases of
the BVDV genome RNA (5'-GUA). Both medium plaque variants appeared to have arisen by
RNA recombination with non-viral sequences (Fig. 5). One medium plaque variant
(5'HCV.R2) had deleted the first 21 bases of the HCV sequence and contained instead a
heterologous sequence of 22 bases. BLAST searches revealed a perfect match between this
sequence and a sequence in a human retina cDNA of unknown function (Tsp509I). The
second medium plaque variant (5'HCV.R3) had also undergone a possible recombination
event leading to the addition of 12 nt to the 5' end of the HCV sequence. Given its short
length, multiple matches were found in the database with this sequence. As for the small
plaque variant, sequencing of multiple clones revealed heterogeneity at the extreme 5' end,
with either G or A identified as the 5' base. Remarkably, for both medium plaque variants,
the fused heterologous sequence began with the tetranucleotide sequence 5'-(G/A)UAU (Fig.
5B). For all three variants, sequencing of the entire 5' NTR and a portion of the Npro coding
region revealed only these changes at the 5' termini.

5' NTR sequence changes are sufficient for the pseudorevertant phenotypes 

To assess the importance of these alterations at the 5' terminus of the 5' HCV
pseudorevertants, derivatives of 5' HCV were created with the changes determined by 5'
RACE (Fig. 6A), and the specific infectivities of these RNAs were analyzed (Fig. 6B).
Corresponding to the small plaque variant, a derivative called 5'HCV.R1orig was engineered
which contained a 5' NTR consisting of the dinucleotide 5'-GU at the 5' terminus of HCV nt
35-341. This results in a 5' terminus consisting of 5'-GUAA. 5'HCV.R1orig RNA had a
specific infectivity at least four orders of magnitude higher than 5' HCV RNA (Figs. 6B and
2A). This demonstrates that this 5' NTR structure is sufficient for phenotypic reversion to
high specific infectivity. However, small plaques and considerable heterogeneity were
observed for 5'HCV.R1orig, suggesting that additional mutations may be present in the
original small plaque variant.

The engineered derivative 5'HCV.R2orig had a 5' NTR consisting of 22 nt of
Tsp509I-homologous sequence followed by HCV nt 22-341. Another construct, called
5'HCV.R3orig, was made, which has the 12 nt of the other heterologous sequence fused to the
intact HCV 5' NTR. Specific infectivities for both these derivatives were essentially the same
as observed for wild type BVDV RNA (2-4 x 10^ PFU/µg; Fig. 6B). Transfection with these
transcripts produced medium plaques, as observed for the original variants, and this
phenotype was stable upon passaging. These results show that the altered 5' NTR sequences
were responsible for the pseudorevertant phenotypes rather than changes elsewhere in their
genomes.



Addition of the tetranucleotide sequence 5'-GUAU to the HCV 5' NTR allows efficient
BVDV replication

For all three 5' HCV variants studied, as well as the BVDV + HCVdelB1B2B3 and
5'EMCV pseudorevertants, 5' NTR alterations seemed to involve creation of a three- or four-
base "consensus" sequence identical to the 5' terminus of BVDV genome RNA. To test the
importance of this sequence, as opposed to fused heterologous sequences, we created a set of
variants with the BVDV 5' tetranucleotide sequence linked to the HCV 5' NTR or the
deletion/recombinant break points identified during sequence analysis of the 5' HCV
pseudorevertants (Fig. 6A). 5'HCV.R1cons had the tetranucleotide sequence 5'-GUAU fused
to HCV nt 35-341. 5'HCV.R2cons had the 5'-GUAU tetranucleotide sequence fused to HCV
nt 22-341. 5'HCV.R3cons contained the tetranucleotide sequence 5'-GUAU fused to the intact
5' terminus of the HCV NTR. RNAs from all three of these derivatives had specific
infectivities more than five orders of magnitude higher than 5'HCV and comparable to
parental BVDV (Fig. 6B).

There were, however, significant differences between the phenotypes of some of
these derivatives versus the reconstructed pseudorevertants. As mentioned above,
5'HCV.R1orig yielded tiny and small plaques and produced low virus yields even after 48 h.
In contrast, the addition of four bases rather than two bases (5'-GUAU vs. 5'-GU) yielded
virus with near wild-type plaque morphology (Fig. 6B) and growth rates (Fig. 7). In the case
of the smaller deletion, 5'HCV.R2orig and 5'HCV.R2cons were indistinguishable, suggesting
that, other than the 5' four bases, the fused heterologous sequences were dispensable. This
was not the case, however, for the chimera containing the 5'-GUAU tetranucleotide sequence
fused to the intact HCV 5' NTR. 5'HCV.R3cons produced small plaques (Fig. 6B) and grew
more slowly than 5'HCV.R3orig (Fig. 7), suggesting that the sequence/structure of the
sequences downstream of the 5' four bases can affect replication efficiency.



The tetranucleotide sequence 5'-GUAU is important for efficient BVDV RNA
accumulation

Next, the effects of the different 5' termini on virus-specific RNA accumulation
directly after transfection were analyzed. This allowed a direct comparison between 5'HCV
and the reconstructed pseudorevertants as well as selected BVDV + HCV deletion constructs.
MDBK cells were transfected with in vitro synthesized RNAs and labeled for 10 h beginning
at 5 h post-transfection with 3H-UTP in the presence of actinomycin D (Fig. 8). RNA
replication of the 5' HCV chimera was severely impaired to a level below detection (Fig. 8,
lane 2). In contrast, every 5' NTR alteration of 5' HCV that increased RNA specific
infectivity and allowed efficient virus growth led to readily detectable viral RNA
accumulation. Addition of B1' and B1 to the 5' terminus of the HCV 5' NTR restored RNA
replication to a level ~50% of that observed for BVDV (BVDV + HCVdelB2B3; Fig. 8, lane
3 vs. lane 1). BVDV + HCVdelB2B3H1 displayed reduced RNA synthesis compared to
BVDV + HCVdelB2B3 (Fig. 8, lane 4 vs. lane 3), perhaps explaining its small plaque
phenotype and suggesting a possible positive role for H1 in replication of this chimera.
5'HCV.R1orig, which had exhibited plaque heterogeneity and slow growth, accumulated less
RNA when compared to 5'HCV.R1cons (Fig. 8, lane 5 vs. lane 6). 5'HCV.R2orig and
5'HCV.R2cons showed similar RNA accumulation (Fig. 8, lane 9 vs. lane 10), consistent with
their medium plaque phenotypes; and 5'HCV.R3cons exhibited reduced RNA synthesis
compared to 5'HCV.R3orig (Fig. 8, lane 8 vs. lane 7), consistent with their small- versus
medium-plaque phenotypes.

Although these RNA phenotypes are complex, the most striking result is that addition
of the B1' and B1 hairpins, addition of heterologous 5' sequences terminating with 5'-GUAU, or
simply fusion of this tetranucleotide sequence with the HCV 5' NTR or short 5' truncations of
the HCV 5' NTR all dramatically upregulated RNA accumulation. This occurred without
increasing translation efficiency, at least as measured in a cell-free assay (Fig. 3, compare
lanes 3-8 to lane 1), suggesting that these sequences function at the level of RNA replication
or stability.

Discussion

The work presented here helps to define the requirements for a functional BVDV
5' NTR. The BVDV-specific 5' NTR sequences required for efficient replication in cell
culture are minimal and consist of the 5' terminal sequence, 5'-GUAU. The sequence 5'-
AUAU, detected for some pseudorevertants, may also be functional but this was not tested for
technical reasons. This simple 5'-terminal tetranucleotide sequence, which is conserved
among pestiviruses (Ruggli et al., 1996; Becher et al., 1998), was shown to function in the
context of functional IRES elements derived from the hepacivirus HCV or the picornavirus
EMCV. As discussed below, this may indicate that the 5' signals required for BVDV RNA
replication are rather simple or that elements in these heterologous IRESs can functionally
replace deleted BVDV sequences.

Sequences at the extreme 5' end of BVDV genome RNA could modulate the
efficiency of RNA accumulation by affecting RNA stability, translation, promoter efficiency,
or some combination of these processes. At this time, we cannot distinguish among these
possibilities but favor an effect on RNA replication. The complement of the BVDV 5'
sequence at the 3' end of the negative-strand RNA presumably functions in the initiation of
positive-strand RNA synthesis. Thus, AUAC-3' at the 3' terminus of minus-strand RNA may
be important for positive-strand RNA synthesis. Interestingly, for some positive-strand RNA
viruses such as rubella virus (Pugachev & Frey, 1998), flock house virus (Ball, 1994) and
turnip crinkle virus (Guan et al., 1997), only minimal cis-acting sequences at the 3' termini of
negative-strand RNAs are required for positive-strand RNA synthesis. In contrast to the 5'
NTR replacements, we were unable to generate replication-competent BVDV-HCV chimeras
with the HCV 3' NTR replacing that of BVDV (data not shown). This may indicate that the
signals within the pestivirus 3' NTR required for initiation of negative-strand RNA synthesis
are more complex and virus specific. Once the replication complex has assembled at the 3'
NTR and traversed the RNA during negative-strand synthesis, the requirements of the 5' NTR
for initiation of positive-strand synthesis may be minimal.
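
The relationship between the plus-strand 5' terminus and the minus-strand 3' terminus follows from simple base pairing. A minimal sketch, using standard Watson-Crick pairing for RNA (the function name is illustrative):

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def minus_strand_3prime(plus_5prime: str) -> str:
    """Return the minus-strand sequence complementary to the plus-strand
    5' terminus, written 5'->3' so that its last base is the 3' end."""
    return plus_5prime.upper().translate(COMPLEMENT)[::-1]

minus = minus_strand_3prime("GUAU")
print(f"{minus}-3'")  # the minus-strand 3' terminus
```

For the plus-strand terminus 5'-GUAU this yields AUAC-3', the sequence discussed above as a possible element in the initiation of positive-strand synthesis.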

Although the RNA replication signals within the 5' NTR appear to be rather simple, it
is possible that the signals important for RNA replication actually extend into the IRES and
are more complicated. For instance, the 5'HCV pseudorevertants were more stable and grew
to higher titers than the 5'EMCV counterparts, despite the fact that the 5'EMCV RNAs were
translated more efficiently in vitro. This may indicate that the BVDV and HCV IRESs
contain signals important for RNA synthesis that are absent in the EMCV IRES.

It is perhaps not surprising that 5' HCV appeared to recombine with cellular mRNAs
to acquire a 5' terminus with the 5'-(G/A)UAU consensus, given that non-cytopathic strains
of BVDV can recombine with BVDV RNA or cellular mRNAs to generate cytopathic strains
of BVDV (Meyers & Thiel, 1996). Presumably, this recombination event involves template
switching during negative-strand RNA synthesis, as observed for poliovirus (Kirkegaard &
Baltimore, 1986). In contrast to 5' HCV, simple deletions of 5' terminal viral sequences could
account for the BVDV + HCVdelB1B2B3 and 5'EMCV pseudorevertants since the
tetranucleotide sequence is present in these 5' NTRs upstream of functional IRES elements.
Such deletions could occur by partial degradation of positive-strand template prior to
negative-strand synthesis, by premature termination during negative-strand RNA synthesis, or
by degradation of 3' terminal negative-strand sequence after synthesis. It is proposed that
5'HCV was forced to recombine with cellular sequences because HCV does not have a 5'-
(G/A)UAU sequence upstream of its IRES. The first occurrence of a (G/A)UAU
tetranucleotide sequence is at nt 94-97 within hairpin H2, and a 5' deletion extending into this
sequence would presumably inactivate or severely impair HCV IRES activity. It is interesting
that BVDV + HCVdelB1B2B3 and 5'EMCV pseudorevertants were generated at much higher
frequency than 5'HCV pseudorevertants. This may indicate that recombination between
BVDV and cellular RNAs is a rare event compared to the processes which lead to deletion of
terminal viral sequences.
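
Locating the first (G/A)UAU tetranucleotide in a 5' NTR, as discussed above, amounts to a simple pattern search. The sketch below assumes a plain RNA string as input; the toy sequence is hypothetical and is not the HCV 5' NTR, and the function name is illustrative.

```python
import re

# (G/A)UAU written as a regular expression.
MOTIF = re.compile(r"[GA]UAU")

def first_motif_position(rna: str):
    """Return the 1-based (start, end) of the first (G/A)UAU match, or None."""
    match = MOTIF.search(rna.upper())
    if match is None:
        return None
    return match.start() + 1, match.end()

# Hypothetical toy sequence: ten bases without the motif, then GUAU.
toy = "GCCAGCCCCC" + "GUAU" + "CCGG"
pos = first_motif_position(toy)
print(pos)
```

A motif that first appears inside an essential IRES hairpin cannot be brought to the 5' terminus by deletion without damaging the IRES, which is the reasoning behind the proposal that 5'HCV had to acquire the sequence by recombination.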

Poliovirus chimeras dependent upon a functional HCV IRES have been reported (Lu
& Wimmer, 1996). Interestingly, viable poliovirus chimeras were produced only when HCV
sequences included both the IRES and the N-terminal portion of the HCV ORF. Nucleotide
sequences or structures in the downstream ORF can modulate HCV IRES translational
efficiency (see Reynolds et al., 1995; Honda et al., 1996a) but it was also suggested that the
N-terminal portion of the HCV core polypeptide might be involved. In the case of our 5'
HCV pseudorevertants, there is no requirement for HCV C protein sequences. Although the
translation efficiency of the HCV IRES in the presence of additional HCV sequences 3' to the
AUG start was not directly assessed, the HCV chimeras and pseudorevertants were
translationally active and infectious in the absence of any portion of the HCV ORF. This
indicates that either the HCV IRES does not extend into the HCV ORF or that the BVDV
ORF contains analogous sequence which functions in our 5'HCV chimeras. There is some
limited identity between HCV and BVDV within this region. For example, HCV nt 359-394
and BVDV nt 405-440 are identical at 21 of 36 positions, although identity within this
sequence may be attributed to a high adenosine content. It is interesting to note that the
luciferase (LUC) and chloramphenicol acetyl transferase (CAT) reporter genes previously
used to detect HCV IRES activity (Tsukiyama-Kohara et al., 1992; Wang et al., 1993) also
have adenosine- or purine-rich regions in relatively the same position as the HCV ORF and
BVDV ORF. If this region is indeed important for IRES activity, this may explain why some
have observed that the HCV IRES does not require a portion of the HCV ORF for translation
of CAT or LUC (Tsukiyama-Kohara et al., 1992; Wang et al., 1993). Point mutations and
insertions within this region of HCV have been shown to reduce HCV IRES activity in vitro
(Honda et al., 1996a,b).
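
The "21 of 36 positions" figure is a position-by-position identity count over two pre-aligned segments of equal length. A minimal sketch of that calculation follows; the toy sequences are hypothetical and are not the HCV or BVDV segments, and the function name is illustrative.

```python
def identity(a: str, b: str):
    """Count identical positions between two equal-length, pre-aligned
    sequences. Returns (matches, length, fraction)."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return matches, len(a), matches / len(a)

# Hypothetical toy pair: identical at 6 of 8 positions.
toy_result = identity("AAAACCCC", "AAAAGGCC")

# The reported 21 of 36 identical positions, expressed as a percentage.
reported_percent = round(100 * 21 / 36, 1)
print(toy_result, reported_percent)
```

Expressed this way, 21 of 36 is about 58% identity. As the text cautions, a count of this kind does not correct for base composition, so an adenosine-rich stretch can score well by chance.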

Despite the fact that B1' and B1 are conserved among different strains of BVDV and
similar hairpins are present in border disease virus and CSFV (Deng & Brock, 1993; Becher
et al., 1998), B1' and B1 were dispensable for BVDV replication, provided that the 5'
tetranucleotide sequence 5'-(G/A)UAU remained. This may indicate a role for B1' and B1 in
viral replication in vivo that we do not observe in cell culture. It will be interesting to test the
phenotype of chimeras that lack B1' and B1 in vivo to determine if they are attenuated and
might serve as useful BVDV vaccines. In this vein, several studies with flaviviruses have
demonstrated that alterations in 5' NTR or 3' NTR elements can lead to attenuation in vivo
(Cahour et al., 1995; Men et al., 1996; Mandl et al., 1998). BVDV chimeras that utilize the
HCV or EMCV IRES may also prove to be attenuated simply due to the presence of the
heterologous IRES. For poliovirus, it has been shown that differences in IRES efficiency in
different host-cell environments can modulate host range and virulence (Shiroki et al., 1997).

BVDV-HCV chimeras that are dependent on a functional HCV IRES may have
another practical application. It may be possible to use these chimeras to screen for anti-HCV
therapeutics that target the HCV IRES. Other researchers have shown antisense
oligonucleotide-mediated inhibition of HCV gene expression in hepatocytes by targeting the
oligonucleotides to the HCV IRES (Hanecak et al., 1996). It will be of interest to measure the
efficacy of antisense oligonucleotides or ribozymes (Lieber et al., 1996) against replicating
virus, and these chimeras are more useful than HCV for this purpose since they are able to
replicate efficiently in cell culture. BVDV is believed to be a reasonable model of HCV
replication not only because of homology and conserved motifs within the 5' NTR but also
because of similarities in overall genetic organization (Rice, 1996) and polyprotein processing
strategy (Tautz et al., 1997; Xu et al., 1997).
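
One way such a screen might be scored is to compare the compound-induced drop in replication of the chimera with the drop seen for the parental pestivirus, so that effects specific to the HCV-derived sequence stand out. The sketch below is an illustration under stated assumptions: replication efficiency is represented by a single positive number (for example a titer), the example values are hypothetical, and the `margin` parameter and function names are not taken from the specification.

```python
def percent_reduction(untreated: float, treated: float) -> float:
    """Percent drop in replication efficiency caused by the compound."""
    if untreated <= 0:
        raise ValueError("untreated value must be positive")
    return 100.0 * (untreated - treated) / untreated

def suggests_anti_hcv_activity(chimera_untreated: float, chimera_treated: float,
                               parent_untreated: float, parent_treated: float,
                               margin: float = 0.0) -> bool:
    """True if the chimera is reduced more than the parental pestivirus.
    `margin` (percentage points) is a hypothetical threshold for noise."""
    chimera_drop = percent_reduction(chimera_untreated, chimera_treated)
    parent_drop = percent_reduction(parent_untreated, parent_treated)
    return chimera_drop - parent_drop > margin

# Hypothetical titers (PFU/ml): chimera falls 99%, parental virus falls 20%.
hit = suggests_anti_hcv_activity(1e6, 1e4, 1e6, 8e5, margin=10.0)
print(hit)
```

A compound that reduces both viruses equally would not be flagged, since that pattern points to general toxicity or to a target shared with the pestivirus backbone rather than to the HCV IRES.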

In view of the above, it will be seen that the several advantages of the invention are
achieved and other advantageous results attained.

As various changes could be made in the above methods and compositions without 
departing from the scope of the invention, it is intended that all matter contained in the above 
description and shown in the accompanying drawings shall be interpreted as illustrative and 
not in a limiting sense.

All references cited in this specification, including patents and patent applications, are
hereby incorporated by reference. The discussion of references herein is intended merely to
summarize the assertions made by their authors and no admission is made that any reference
constitutes prior art. Applicants reserve the right to challenge the accuracy and pertinency of
the cited references.

What is Claimed is:

1. A polynucleotide comprising a chimeric viral RNA which comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV), and wherein said chimeric viral RNA is replication-competent.

2. The polynucleotide of claim 1, wherein the chimeric region is the 5' NTR and
the first pestivirus nucleotide sequence is from a bovine viral diarrhea virus (BVDV).

3. The polynucleotide of claim 2, wherein the BVDV nucleotide sequence is
located at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU.

4. The polynucleotide of claim 3, wherein the first HCV nucleotide sequence in
the chimeric 5' NTR comprises an internal ribosome entry site (IRES).

5. The polynucleotide of claim 4, wherein the ORF and the 3' NTR consist of
second and third BVDV sequences.

6. The polynucleotide of claim 5, wherein the 5' terminal sequence comprises 5'
GUAU.

7. The polynucleotide of claim 4, wherein the ORF comprises a second HCV
sequence encoding at least one structural protein operably linked to a second BVDV
sequence.

8. The polynucleotide of claim 1, wherein the pestivirus is BVDV and the
chimeric region is the 3' NTR.

9. The polynucleotide of claim 8, wherein the first HCV sequence in the
chimeric 3' NTR comprises the HCV 98 bp 3' terminal element (SEQ ID NO:X) operably
linked to the first BVDV sequence.

10. A method for identifying compounds having antiviral activity against
hepatitis C virus (HCV) comprising the steps of:

(a) providing a first cell containing a chimeric viral RNA which is replication-
competent in the cell, the chimeric viral nucleic acid comprising a 5' nontranslated region (5'
NTR), an open reading frame (ORF) region; and a 3' nontranslated region (3' NTR);
wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV);

(b) providing a second cell containing the pestivirus; and

(c) comparing the replication efficiency of the chimeric viral RNA in the
presence and absence of a test compound to the replication efficiency of the pestivirus in the
presence and absence of the test compound,

wherein a greater reduction in compound-induced replication efficiency of the chimeric viral
RNA than the pestivirus indicates the compound has anti-HCV activity.

11. The method of claim 10, wherein the chimeric region is the 5' NTR and the
first pestivirus nucleotide sequence is from a bovine viral diarrhea virus (BVDV).

12. The method of claim 11, wherein the BVDV nucleotide sequence is located
at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU.

13. The method of claim 12, wherein the first HCV nucleotide sequence in the
chimeric 5' NTR comprises an internal ribosome entry site (IRES).

14. The method of claim 13, wherein the ORF and the 3' NTR comprise second
and third sequences from the BVDV.

15. The method of claim 10, wherein the pestivirus is BVDV and the chimeric
region is the 3' NTR.

16. A genetically-engineered virus comprising a chimeric RNA genome which
comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV), and wherein said chimeric RNA genome is replication-competent.

5 17. The genetically-engineered virus of claim 1 6, wherein the chimeric region is 

the 5' NTR and the first pestivirus nucleotide sequence is from a bovine viral diarrhea virus 
(BVDV). 

18. The genetically-engineered virus of claim 16, wherein the BVDV nucleotide 
10 sequence is located at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU and 

the first HCV nucleotide sequence in the chimeric 5 ' NTR comprises an internal ribosome 
entry site (IRES). 

19. A vaccine against bovine viral diarrhea virus (BVDV) comprising an
immunogenically-effective amount of a genetically-engineered virus comprising a chimeric
RNA genome having:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from BVDV in operable linkage with a first nucleotide sequence from an hepatitis C virus
(HCV), and wherein the genetically-engineered virus is attenuated as compared to BVDV.

20. The vaccine of claim 19, wherein the chimeric region is the 5' NTR and the
BVDV nucleotide sequence is located at the 5' terminus of the chimeric 5' NTR and
comprises 5' RUAU and the first HCV nucleotide sequence in the chimeric 5' NTR
comprises an internal ribosome entry site (IRES).

21. A polynucleotide comprising a chimeric viral RNA which comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a heterologous nucleotide sequence and wherein
said chimeric viral RNA is replication-competent.
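
The comparison recited in claim 10 can be illustrated with a short calculation. This is only a sketch: the claim does not specify how replication efficiency is measured or how the reduction is quantified, so the fractional-reduction formula, the function names, and the example values below are assumptions made for illustration.

```python
def fractional_reduction(untreated: float, treated: float) -> float:
    """Fraction of replication efficiency lost in the presence of the compound.

    Assumption: efficiency is a positive number on any consistent scale
    (for example a titer or an RNA yield); the claim does not fix the assay.
    """
    if untreated <= 0:
        raise ValueError("untreated replication efficiency must be positive")
    return (untreated - treated) / untreated


def indicates_anti_hcv_activity(
    chimera_untreated: float,
    chimera_treated: float,
    pestivirus_untreated: float,
    pestivirus_treated: float,
) -> bool:
    """True when the compound reduces the chimera more than the parent pestivirus."""
    chimera_drop = fractional_reduction(chimera_untreated, chimera_treated)
    pestivirus_drop = fractional_reduction(pestivirus_untreated, pestivirus_treated)
    return chimera_drop > pestivirus_drop


if __name__ == "__main__":
    # Hypothetical values: the chimera loses 80% of its replication,
    # the parent pestivirus only 10%.
    print(indicates_anti_hcv_activity(100.0, 20.0, 100.0, 90.0))
```

Comparing fractional rather than absolute reductions is a design choice of this sketch; it keeps the comparison meaningful when the chimera and the parent pestivirus replicate to different levels in the absence of the compound.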
FIGURE 2A

FIGURE 2B

FIGURE 4A

FIGURE 4B

FIGURE 8
pACNR/BVD NADL-Xba* -> Graphic Map

DNA sequence 15065 bp gtatacgagaat ... cgactcactata circular

pACNR/BVD NADL-Xba = HaeII and XhoI digest of pACNR/BVD NADL ligated to
HaeII and XhoI digest of pACNR1180/DraIII-/BVD5'
8/27 corrected nt 12136 G to C to give HpaI site.

[Circular plasmid map. Labeled features and positions: BVD-Npro 386; BVD-C 890; BVD-Erns 1196; BVD-E1 1877; BVD-E2 2462; BVD-p7 3584; BVD-NS2 3794; BVD-NS3 5423; BVD-NS4A 7472; BVD-NS4B 7664; BVD-NS5A 8705; BVD-NS5B 10193; Amp Stop 14003; Amp Start 14852. Labeled restriction sites: XhoI 224; SphI 247; SacI 695; RsrII 2827; DraIII 5179; BsmBI 6016; EagI 12330; SbfI 12577; HaeII 13026; Eco47III 13026; SgrAI 13114; SacII 13276; BstBI 13764; XbaI 15043.]

FIGURE 9
pACNR/BVD NADL-Xba* -> Genes

DNA sequence 15065 b.p. gtatacgagaat ... cgactcactata circular

pACNR/BVD NADL-Xba = HaeII and XhoI digest of pACNR/BVD NADL ligated to
HaeII and XhoI digest of pACNR1180/DraIII-/BVD5'
8/27 corrected nt 12136 G to C to give HpaI site.



321 cagcctgatagggtgctgcagaggcccactgtattgctactaaaaatctctgccgtacatggcac ATG GAG TTG 394
1 MEL 3

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454
4 ITNELLYKTYKQKPVGVEEP 23

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG 514
24 VYDQAGDPLFGERGAVHPQS 43

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574
44 TLKLPHKRGERDVPTNLASL 63

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634
64 PKRGDCRSGNSRGPVSGIYL 83

635 AAG CCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCG CTG 694
84 KPGPLFYQDYKGPVYHRAPL 103

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CGG ATA GGG AGA GTA ACT GGA 754
104 ELFEEGSMCETTKRIGRVTG 123

755 AGT GAC GGA AAG CTG TAC CAC ATT TAT GTG TGT ATA GAT GGA TGT ATA ATA ATA AAA AGT 814
124 SDGKLYHIYVCIDGCIIIKS 143

815 GCC ACG AGA AGT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874
144 ATRSYQRVFRWVHNRLDCPL 163

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934
164 WVTTCSDTKEEGATKKKTQK 183

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994
184 PDRLERGKMKIVPKESEKDS 203

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054
204 KTKPPDATIVVEGVKYQVRK 223

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAG GAC GGC TTG TAC CAT AAC AAA AAC AAA CCT 1114
224 KGKTKSKNTQDGLYHNKNKP 243

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GCT ATA GTT 1174
244 QESRKKLEKALLAWAIIAIV 263

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GGG ACG 1234
264 LFQVTMGENITQWNLQDNGT 283

1235 GAA GGG ATA CAA CGG GCA ATG TTC CAA AGG GGT GTG AAT AGA AGT TTA CAT GGA ATC TGG 1294
284 EGIQRAMFQRGVNRSLHGIW 303

1295 CCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AAA ACA 1354
304 PEKICTGVPSHLATDIELKT 323

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CAA CGC 1414
324 IHGMMDASEKTNYTCCRLQR 343

1415 CAT GAG TGG AAC AAG CAT GGT TGG TGC AAC TGG TAC AAT ATT GAA CCC TGG ATT CTA GTC 1474
344 HEWNKHGWCNWYNIEPWILV 363

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534
364 MNRTQANLTEGQPPRECAVT 383

1535 TGT AGG TAT GAT AGG GCT AGT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CCC ACA 1594
384 CRYDRASDLNVVTQARDSPT 403

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CGG GGC 1654
404 PLTGCKKGKNFSFAGILMRG 423

1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT AGT 1714
424 PCNFEIAASDVLFKEHERIS 443



1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA OGT GCC 1774 

444 MFQDTTLYLVDGLTNSLEGA 463 

1775 AGA CAA GGA ACC OCT AAA CTG ACA ACC TGG TTA GGC AAG CAG CTC GOG ATA CTA GGA AAA 1B34 

464 RQGTAKLTTWLGKOLG I LGK 483 

1835 AAG TTC GAA AAC AAG AGT AAG ACG TOG TTT GGA GCA TAC GCT GCT TCC CCT TAC TGT GAT 1894 

484 KLENKSKTWFGAYAASPYCD 503 



1895 GTC GAT CGC AAA ATT GGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TGC TTA CCC 1954

504 VDRKIGYIWYTKNCTPACLP 523



1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAG ACC AAT GCA GAG GAC GGC AAG ATA 2014 

524 KNTK I VG PGKFDTNAEDGX I 543 



2015 TTA CAT GAG ATG GGG GGT CAC TTG TCG GAG GTA CTA CTA CTT TCT TTA GTC GTG CTC TCC 2074

544 LHEMGGHLSEVLLLSLVVLS 563



2075 GAC TTC OCA CCG GAA ACA GCT AGT GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564DPAPETASVMYLILHFSIPQ 583 

2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLTVELT 603 

2195 ACA GCT GAA GTA ATA CCA GGG TOG GTC TGG AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 TAEVIPGSVWNLGKYVCIRP 623 

2255 AAT TOG TGQ CCT TAT GAG ACA ACT GTA GTG TTG GCA TTT GAA GAG GTG AOC CAG GTG (TTG 2314 

624 NWWPYETTVVLAFEEVS0VV 643 

2315 AAG TTA GTC TTC AQG GCA CTC AGA GAT TTA ACA CGC ATT TOG AAC GCT OCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIWNAATTT 663 

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA GTC AGG GGC CAG ATC GTA CAG GGC ATT CTC TGG 2434 

664 AFLVCLVKIVRGQMVQGILW 683 

2435 CTA CTA TTG ATA ACA GOG GTA CAA GOG CAC TTO GAT TGC AAA OCT GAA TTC TCG TAT GCC 2494 

684 LLLITGVOGHLDCKPBFSYA 703 

2495 ATA GCA AAG GAC GAA AGA ATT GCT CAA CTC GOG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 IAKDERIGQLGAEGLTTTWK 723 

2555 GAA TAC TCA CCT GGA ATC AAG CTC GAA GAC ACA ATC GTC ATT GCT TGG TGC GAA GAT GGG 2614 

724 EYSPGMKLEDTMVIAWCEDG 743 

2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTC CAT ACA 2674 

744 KLMYLQRCTRETRYLAILHT 763 

2675 AGA GCC TTG COG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CGA AAG CAA GAG GAT 2734 

764 RALPTSVVFKKLFDGRKQED 783 

2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794 

784 VVEMNDNFEFGLCPCDAKPI 803 

2795 CTA AGA GGG AAG TTC AAT ACA AOOCTGCTQAACQGACOOGCCTTCCAGATGGTATCSCCCC 2854 

804 VRGKPNTTLLNGPAFQMVCP 823 

2855 ATA GGA TGQ ACA GGG ACT CTA AGO TCT ACG TCA TTC AAT ATC GAC ACC TTA GCC ACA ACT 2914 

824 IGWTGTVSCTSFNMDTLATT 843 

2915 OtO GTA OGG ACA TAT AGA AQG TCT AAA CCA TTC CCT CAT AGG CAA GGC TCT ATC ACC CAA 2974 

844 VVRTYRRSKPFPHRQGCITQ 863 

2975 AAO AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TCT GTC CCT 3034 

864 KNLGEDLHNCILGGNWTCVP 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TCT GGC TAT CAA 3094 

B84GDQLLYKGGSIESCKWCGYQ 903 

3095 TTT AAA GAG ACT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TCT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHYPIGKCKLENE 923 

3155 ACT GOT TAC AGG CTA CTA GAC ACT ACC TCT TGC AAT AGA GAA GCT GTC GCC ATA CTA CCA 3214 
924 TGYRLVDSTSCNREGVAIVP 943 

3215 CAA GOO ACA TTA AAO TGC AAG ATA GGA AAA ACA ACT CTA CAG GTC ATA GCT ATC GAT ACC 3274 
944 0GTLKCKIGKTTVQVIAMDT 963 

3275 AAA CTC OGA CCT ATO OCT TGC AGA CCA TAT GAA ATC ATA TCA ACT GAG 000 CCT CTA GAA 3334 
964 KLGPMPCRPYEIISSEGPVE 983 

3335 AAG ACA OCG TCT ACT TTC AAC TAC ACT AAO ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 
984 KTACTFNYTKTLKNKYFEPR 1003 
3395 GAC AGC TAC TTT CAG CAA TAC ATG CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454

1004 DSYFQQYMLKGEYQYWFDLE 1023

3455 CPG ACT GAC CAT CAC CGG GAT TAC TTC GCT GAG TCC ATA TTA GTG GTQ GTA CTA GCC CTC 3514 

1024 VTDHHRDYFA£SILVVVVAL 1043 

3515 TTG GCT QGC AGA TAT CTA err TOG TTA CTG GTT ACA TAC ATG GTC TPA TCA GAA CAG AAG 3574 

1044 LGGRYVLWLLVTYMVLSEQK 1063 

3575 GCC TTA GGG ATT CAG TAT GGA TCA GOG GAA GTG GTG ATG ATG GGC AAC TTG CTA ACC CAT 3634 

1064 ALGIQYGSGEVVMMGNLLTH 1083 

3635 AAC AAT ATT GAA GTG GTG ACA TAC TTC TTG CTG CTG TAC CTA CTG CTG AGG GAG GAG AGC 3694 

1084 NNIEVVTYFLLLYLLLREES 1103 

3695 CTA AAG AAG TGG GTC TTA CTC TTA TAC CAC ATC TTA GTG CTA CAC CCA ATC AAA TCT CTA 3754 

1104 VKKWVLLLYHILVVHPIKSV 1123 

3755 ATT GTG ATC CTA CTG ATG ATT GGG GAT GTG CTA AAG GCC GAT TCA GGG GGC CAA GAG TAC 3814 

1124 IVI LLMIGDVVKADSGGQEY 1143 

3815 TTG GOG AAA ATA GAC CTC TCT TTT ACA ACA CTA CTA CTA ATC GTC ATA GCT TTA ATC ATA 3874 

1144 LGKIDLCPTTVVLIVIGLI I 1163 

3875 GCC AGG COT GAC OCA ACT ATA GIG CCA CTG CTA ACA ATA ATG GCA OCA CTG AOG GTC ACT 3934 

1164 ARRDPTIVPLVTIMAALRVT 1183 

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC GCT GTG GCG GTC ATG ACT ATA ACC CTA CTG 3994 

1184 ELTHQPGVDIAVAVHTITLL 1203 

3995 ATO GTT AGC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TGG TTA CAG TOC ATT CTC AGC 4054 

1204 MVSYVTDYFRYKKWLQCX LS 1223 

4055 CTG CTA TCT GCG GTG TTC TTG ATA AGA AGC CTA ATA TAC CTA GCT AGA ATC GAG ATG CCA 4114 

1224 LVSAVFLIRSLIYLGRIEMP 1243 

4115 GAG CTA ACT ATC CCA AAC TGG AGA CCA CTA ACT TTA ATA CTA TTA TAT TTG ATC TCA ACA 4174 

1244 EVTIPNWRPLTLILLYLI ST 1263 

4175 ACA ATT CTA ACG AOG TOG AAG GTT GAC GTG GCT GGC CTA TTG TTG CAA TCT (TTG OCT ATC 4234 

1264 TIVTRWKVDVAGLLLQCV PI 1283 

4235 TTA TTG CTG CTC ACA ACC TTG TGG GCC GAC TTC TTA ACC CTA ATA CTG ATC CTG CCT ACC 4294 

1284 LLLVTTLWADFLTLILILPT 1303 

4295 TAT GAfc TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AGG ACT GAT ATA GAA AGA AGT TGG 4354 

1304 YELVKLYYLKTVRTDIERSW 1323 

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG ACT GGA GAG 4414 

1324 L6GIDYTRVDS lYDVDESGE 1343 

4415 GGC CTA TAT CTT TTT CCA TCA AOG CAG AAA OCA CAG GGG AAT TTT TCT ATA CTC TTG OOC 4474 

1344 GVYLFPSRQKAQGNPSILLP 1363 

4475 CTT ATC AAA GCA ACA CTG ATA ACT TOC GTC AGC ACT AAA TOG CAG CTA ATA TAC ATG ACT 4534 

1364 LIKATLISCVSSKWQLIYMS 1383 

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AOG AAA GTT ATA GAA GAG ATC TCA GGA 4594 

1384 YLTLDFMYYMHRKVIEEI SG 1403 

4595 OCT ACC AAC ATA ATA TCC AGG TTA GTG GCA GCA CTC ATA GAG CTG AAC TGG TCC ATG GAA 4654 

1404 GTNIISRLVAALIELNWSME 1423 

4655 GAA GAG GAG AGC AAA GGC TTA AAG AAG TTT TAT CTA TTG TCT OGA AOG TTG AGA AAC CTA 4714 

1424 EEESKGLKKFYLLSGRLRNL 1443 

4715 ATA ATA AAA CAT AAG CTA AGG AAT GAG ACC GTG GCT TCT TOG TAC GOG GAG GAG GAA GTC 4774 

1444 IIKHKVRNETVASWYGEEEV 1463 

4775 TAC GOT ATG CCA AAG ATC ATG ACT ATA ATC AAG GCC ACT ACA CTG ACT AAG AGC AGO CAC 4834 

1464 YGMPKIMTI IKASTLSKSRH 1483 

4835 TOC ATA ATA TGC ACT CTA TCT GAG GGC CGA GAG TGG AAA GCT GGC ACC TGC CCA AAA TCT 4894 

1484 CIICTVCEGREWKGGTCPKC 1503 

4895 GGA CGC CAT GGG AAG CCG ATA ACG TCT 000 ATG TCG CTA GCA GAT TTT GAA GAA AGA CAC 4954 

1504 GRHGKP ITCGMSLADFEERH 1523 

4955 TAT AAA AGA ATC TTT ATA AGG GAA GGC AAC TTT GAG GOT ATG TOC AGC CGA TGC CAG GGA 5014 

1524 YKR IPI REGNFEGMCSRCQO 1543 

5015 AAG CAT AGG AOG TTT GAA ATG GAC COG GAA CCT AAG ACT GCC AGA TAC TCT OCT GAG TCT 5074 

1544 KHRRFEMDREPKSARYCA.BC 1563 



5075 AAT AOG CTG CAT OCT GCT GAG GAA GCT GAC TTT TOG GCA GAG TCG AGC ATG TTG GGC CTC 5134 
1564 NRLHPAEEGDFWAESSMLGL 1583 
5135 AAA ATC ACC TAC TTT GCG CTG ATG GAT GGA AAG GTG TAT GAT ATC ACA GAG TGG GCT GGA 5194

1584 KITYFALMDGKVYDITEWAG 1603 

5195 TtX: CAG CGT GTTG GGA ATC TCC CCA GAT ACC CAC AGA GTC CCT TGT CAC ATC TCA 5254 

1604 CORVGISPDTHRVPCHISFG 1623 

5255 TCA CGG ATG OCT TTC AGG CAG GAA TAC AAT GGC TTT GTA CAA TAT ACC OCT AGG GGG CAA 5314 

1624 SRMPFRQEYNGFVOYTARGO 1643 



5315 CTA TTT CTG AGA AAC TTG CCC GTA CTG GCA ACT AAA GTA AAA ATG CTC ATG GTA GGC AAC 5374
1644 LFLRNLPVLATKVKMLMVGN 1663



5375 CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT CTT GOG TOO ATC CTA AGG GGG CCT GCC CTG 5434 
1664 LGEEIGNLEHLGWILRGPAV 1683 

5435 TCT AAG AAG ATC ACA GAG CAC GAA AAA TGC CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA 5494 
1684 CKKITEHEKCHINILDKLTA 1703 

5495 TTT TTC GOG ATC ATG CCA AGO GGG ACT ACA CCC AGA GCC CCG GTG AGG TTC CCT ACG AGC 5554 
1704 FFGIMPRGTTPRAPVRFPTS 1723 

5555 TTA CTA AAA GTO AGG AGG GGT CFG GAG ACT GCC TGG GCT TAC ACA CAC CAA GGC GGG ATA 5614 
1724 LLKVRRGLETAWAYTHQGGI 1743 

5615 AST TCA CnC GAC CAT CTA ACC GCC GGA AAA GAT CTA CTG GTC TGT GAC AGC ATG GGA GGA 5674 
1744 SSVDHVTAGKDLLVCDSMOR 1763 

5675 ACTAOAGTCGTTTGCCAAAGCAACAACAGOTTOACCQATGAGACAQAGTATQGCGTCAAG 5734 
1764 TRVVCQSNNRLTDETEYGVK 1783 

5735 ACT GAC TCA GGG TGC CCA GAC GOT GCC AGA TCT TAT GTG TTA AAT CCA GAG GCC GTT AAC 5794 
1784 TDSGCPDGARCYVLNPEAVN 1803 

5795 ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC CTC CAA AAG ACA OCT GGA GAA TTC ACG TGT 5854 
1804 ISGSKGAVVHLQKTGGEFTC 1823 

5855 GTCACCGCATCAQGCACACCGGCTTTCTTCGACCTAAAAAACTTGAAAGGATGGTCAGGC 5914 
1824 VTASGTPAFFDLKNLKGWSG 1843 

5915 TTC CCT ATA TTT GAA GCC TCC AGC GGG AGO GTO GTT GGC AGA GTC AAA GTA GOG AAG AAT 5974 
1844 LPIFEASSGRVVGRVKVGKN 1863 

5975 GAA GAG TCT AAA CCT ACA AAA ATA ATG AGT GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA 6034 
1864 EESKPTKIMSGIQTVSKNRA 1883 

6035 GAC CTO ACC GAG ATC GTC AAG AAG ATA ACC AGC ATC AAC AGG GGA GAC TTC AAG CAG ATT 6094 
1884 DLTEMVKKITSMNRGDFKQI 1903 

6095 ACT TIG GCA ACA GGG OCA GGC AAA ACC ACA GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA 6154 
1904 TLATGAGKTTELPKAVIEEI 1923 

6155 GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA CCA TTA AGOQCAGOOGCAGAGTCAGTCTAC 6214 
1924 GRHKRVLVLIPLRAAAESVY 1943 

6215 CAG TAT ATC AGA TTG AAA CAC CCA AGC ATC TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA 6274 
1944 QYMRLKHPSISFMLRIGDMK 1963 

6275 GAG GGG GAC ATC GCA ACC GGG ATA ACC TAT GCA TCA TAC GOG TAC TTC TGC CAA ATC CCT 6334 
1964 EGDMATGITYASYGYFCQMP 1983 

6335 CAA CCA AAG CTC AGA GCT GCT ATC CTA GAA TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT 6394 
1984 QPKLRAAMVEYSYIPLDEYH 2003 

6395 TCTGCCACTCCTGAACAACPGGCAATrATCGOaAAOATCCACAOATrT TCA GAG AGT ATA 6454 
2004 CATPEQLAIIGKIHRFSESI 2023 

6455 AGG GTT GTC GCC ATC ACT GCC ACG CCA GCA GGG TCG GTC ACC ACA ACA GOT CAA AAG CAC 6514 
2024 RVVAMTATPAGSVTTTGQKH 2043 

6515 CCA ATA GAG GAA PTC ATA GCC CCC GAG GTA ATC AAA GGG GAG GAT CTT GGT AGT CAG TTC 6574 
2044 PIEEFIAPEVMKGEDLGSQF 2063 

6575 CTT GAT ATA GCA GGG TTA AAA ATA OCA GTC GAT GAG ATC AAA GGC AAT ATC TTG GTT TTT 6634 
2064 LDIAGLKTPVDEMKGNMLVF 2083 

6635 CTA CCA ACG AOA AAC ATO GCA GTA GAG CTA GCA AAG AAG CTA AAA GCT AAG GGC TAT AAC 6694 
2084 VPTRNMAVEVAKKLKAK GYN 2103 

669S TCT GGA TAC TAT TAC AGT GGA GAG GAT OCA GCC AAT CTC AGA GTT GTC ACA TCA CAA TCC 6754 
2104 SGYYYSGEDPAKLRVVTSQS 2123 

6755 CCC TAT CTA ATC CTG GCT ACA AAT GCT ATT GAA TCA GGA GTC ACA CTA CCA GAT TTG GAC 6814 
2124 pyVIVATNAI ESGVTLPDLD 2143 

6815 ACG GTT ATA GAC ACG GGG TTC AAA TCT GAA AAG AGG GTC AGG CTA TCA TCA AAG ATA CCC 6874 
2144 TVIDTGLKCEKRVRVSSKI P 2163 
6875 TTC ATC GTA ACA GGC CTT AAG AGG ATG GCC GTG ACT GTG GGT GAG CAG GCG CAG CGT AGG 6934

2164 FIVTGLKRMAVTVGEQAQRR 2183

6935 GGC AGA GTA GGT AGA GTG AAA CCC GOG AGG TAT TAT AGG AGC CAG GAA ACA GCA ACA OGG 6994 

2184 GRVGRVKPGRYYRSQETATG 2203 

6995 TCA AAG GAC TAC CAC TAT GAG CTC TTG CAG GCA CAA AGA TAC GGG ATT GAG GAT GGA ATC 7054 

2204 SKDYHYDLLQAORYG lEDGI 2223 

7055 AAC GTG ACG AAA TCC TTT AGG GAG ATG AAT TAC GAT TOG AGC CTA TAC GAG GAG GAC AGC 7114 

2224 NVTKSFREMNYDWSLYEEDS 2243 

7115 CTA CTA ATA ACC CAG CTG GAA ATA CTA AAT AAT CTA CTC ATC TCA GAA GAC TTG CCA GCC 7174 

2244 LLITQLEILNNLLISEDLPA 2263 

7175 GCT GTT AAG AAC ATA ATG GCC AGG ACT GAT CAC CCA GAG CCA ATC CAA CTT GCA TAC AAC 7234 

2264 AVKNIMARTDHPEPIQLAYN 2283 

7235 AGC TAT GAA GTC CAG GTC CCG GTC CTG TTC CCA AAA ATA AGG AAT GGA GAA GTC ACA GAC 7294 

2284 SYEVQVPVLFPKIRNGEVTD 2303 

7295 ACC TAC GAA AAT TAC TCQ TTT CTA AAT GCC AGA AAG TTA GGG GAG GAT GTG CCC GTG TAT 7354 

2304 TYENYSFLNARKLGEDVPVY 2323 

7355 ATC TAC GCT ACT GAA GAT GAG GAT CTG GCA GTT GAC CTC TTA GGG CTA GAC TGG CCT GAT 7414 

2324 XYATEDEDLAVDLLGLDWPD 2343 

7415 CCT GGG AAC CAG CAG GTA GTG GAG ACT GCTT AAA GCA CTG AAG CAA GTG ACC GGG TTG TCC 7474 

2344 PGNQQVVETGKALKQVTGLS 2363 

7475 TCG OCT GAA AAT GCC CTA CTA GTG GCT TTA TTT QG3 TAT OTQ GOT TAC CAG GCT CTC TCA 7534 

2364 SAENALLVALFGYVGYQALS 2383 

7535 AAG AGG CAT GTC CCA ATG ATA ACA GAC ATA TAT ACC ATC GAG GAC CAG AGA CTA GAA GAC 7594 

2384 KRHVPMITDIYTIEDQRLED 2403 

7595 ACC ACC CAC CTC CAG TAT GCA CCC AAC GCC ATA AAA ACC GAT GGG ACA GAG ACT GAA CTG 7654 

2404 TTHLQYAPNAIKTDOTETEL 2423 

7655 AAA GAA CTG GCG TCG GGT GAC GTIG GAA AAA ATC ATG GGA GCC ATT TCA GAT TAT GCA GCT 7714 

2424 KELASGDVEKIMGAISDYAA 2443

7715 GGG GGA CTG GAG TTT GTT AAA TCC CAA GCA GAA AAG ATA AAA ACA GCT CCT TTG TTT AAA 7774

2444 GGLEFVKSQAEKIKTAPLFK 2463

7775 GAA AAC GCA GAA GCC GCA AAA GGG TAT GTC CAA AAA TTC ATT GAC TCA TTA ATT GAA AAT 7834

2464 ENAEAAKGYVQKFIDSLIEN 2483

7835 AAA GAA GAA ATA ATC AGA TAT GGT TTG TGG GGA ACA CAC ACA GCA CTA TAC AAA AGC ATA 7894 

2484 KEEIZRYGLWGTHTALYKSI 2503 

7895 GCT GCA AGA CTG GGG CAT GAA ACA GCG TTT GCC ACA CTA GTG TTA AAG TGG CTA GCT TTT 7954 

2504 AARLGHETAFATLVLKWLAF 2523 

7955 GGA GGG GAA TCA GTG TCA GAC CAC GTC AAG CAG GCG GCA GTT GAT TTA GTG GTC TAT TAT 8014 

2524 GGESVSDHVKQAAVDLVVYY 2543 

8015 GTG ATG AAT AAG CCT TCC TTC CCA GGT GAC TCC GAG ACA CAG CAA GAA GGG AGG CGA TTC 8074 

2544 VMNKPSFPGDSETQQEGRRF 2563 

8075 GTC GCA AGC CTG TTC ATC TCC GCA CTG GCA ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC 8134 

2564 VASLFI SALATYTYKTWNYK 2583 

8135 AATCTCTCTAAACTGGTGGAACCAGCCCTGGCTTACCTCCCCTAT GCT ACC AGC GCA TTA 8194 

2584 NLSKVVEPALAYLPYATSAL 2603 

8195 AAA ATG TTC ACC CCA ACG CGG CTG GAG AGC GTG GTG ATA CTG AGC ACC ACG ATA TAT AAA 8254 

2604 KMFTPTRLESVVILSTTIYK 2623 

8255 ACA TAC CTC TCT ATA AGG AAG GGG AAG ACT GAT GGA TTG CTG GGT ACG GGG ATA AGT GCA 8314 

2624 TYLSIRKGKSDGLLGTOI SA 2643 

8315 GCC ATG GAA ATC CTG TCA CAA AAC CCA GTA TCG GTA GOT ATA TCT GTG ATG TTG GGG GTA 8374 

2644 AKEILSQNPVSVGISVMLGV 2663 

8375 GOG GCA ATC GCT GCG CAC AAC GCT ATT GAG TCC ACT GAA CAG AAA AGG ACC CTA CTT ATG 8434 

2664 GAIAAHNAIESSEQKRTLLM 2683 

8435 AAG OTG TTT OTA AAG AAC TTC TTG GAT CAG GCT GCA ACA GAT GAG CTG OTA AAA GAA AAC 8494 

2684 KVFVKNFLDQAATDELVKEN 2703 

8495 OCA GAA AAA ATT ATA ATG GCC TTA TTT GAA GCA GTC CAG ACA ATT GOT AAC CCC CTG AGA 8554 

2704 PEKIIMALFEAVQTIGNPLR 2723 

8555 CTA ATA TAC CAC CTG TAT GGG GTT TAC TAC AAA GGT TGG GAG GCC AAG GAA CTA TCT GAG 8614 

2724 LIYHLYGVYYKGWEAKELSE 2743 
8615 AGG ACA GCA GGC AGA AAC TTA TTC ACA TTG ATA ATG TTT GAA GCC TTC GAG TTA TTA GGG 8674

2744 RTAGRNLFTLIMFEAFELLG 2763 

8675 ATC GAC TCA CAA GGG AAA ATA AGG AAC CTG TCC GGA AAT TAC ATT TTG GAT TTG ATA TAC 8734 

2764 MDSQGKIRNLSGNYI LDLIY 2783 

8735 GGC CTA CAC AAG CAA ATC AAC AGA GGG CTG AAG AAA ATG GTA CTG GGG TGG GCC CCT GCA 8794 

2784 GLHKQINRGLKKMVLGWAPA 2803 

8795 CCC TIT ACTT TGT GAC TGG ACC CCT ACT GAC GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT 8854 

2804 PFSCDWTPSDERIRLPTDNY 2823 

8855 TTC AGG GTA GAA ACC AGG TOC CCA TGT GGC TAT GAG ATG AAA GOT TTC AAA AAT GTA GGT 8914 

2824 LRVETRCPCGYEMKAFKNVG 2843 

8915 GGC AAA CTT ACC AAA GTG GAG GAG AGC GGG CCT TTC CTA TCT AGA AAC AGA CCT GGT AGG 8974 

2844 GKLTKVEESGPFLCRNRPGR 2863 

8975 GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA 9034 

2864 GPVNYRVTKYYDDNLREIKP 2883 

9035 GTTA GCA AAG TTG GAA GGA CAG GTA GAG CAC TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC 9094 

2884 VAKLEGQVEHYYKGVTAKID 2903 

9095 TACAGTAAAGGAAAAATGCTCTTGGCCACTGACAAGTGGGAG GTG GAA CAT GGT €?rC ATA 9154 

2904 YSKGKMLLATDKWEVEHGVI 2923 

9155 ACC AGG TTA GCT AAG AGA TAT ACT GOG GTC GGG TTC AAT GCrr OCA TAC TTA GGT GAC GAG 9214 

2924 TRLAKRYTGVGFNGAYLGDE 2943 

9215 CCC AAT CAC CGT GCT CTA GTG GAG AGG GAC TCT GCA ACT ATA ACC AAA AAC ACA GTA CAG 9274 

2944 PNHRALVERDCATITKNTVQ 2963 

9275 TTT CTA AAA ATG AAG AAG GGG TCT GCG TTC ACC TAT GAC CTG ACC ATC TCC AAT CTG ACC 9334 

2964 FLKMKKGCAFTYDLTISNLT 2983 

9335 AOG CTC ATC GAA CTA GTA CAC AGG AAC AAT CTT GAA GAG AAG GAA ATA CCC ACC OCT ACG 9394 

2984 RLIELVHRNNLEEKEIPTAT 3003 



9395 GTC ACC ACA TGG CTA GCT TAC ACC TTC GTG AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA 9454
3004 VTTWLAYTFVNEDVGTIKPV 3023



9455 CTA GGA GAG AGA CTA ATC CCC GAC CCT CTA GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA 9514 

3024 LGERVIPDPVVDINLQPEVQ 3043 

9515 GTO GAC ACG TCA GAG GTT GGG ATC ACA ATA ATT GGA AGG GAA ACC CTG ATG ACA ACG GGA 9574 

3044 VOTSEVGITI IGRETLMTTG 3063 

9575 CTG ACA CCT GTC TTG GAA AAA CTA GAG CCT GAC GCC AGC GAC AAC CAA AAC TCG GTG AAG 9634 

3064 VTPVLEKVEPDASDNQNSVK 3083 

9635 ATC GGG TTG GAT GAG GCT AAT TAC CCA GGG CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA 9694 

3084 IGLDEGNYPGPGIQTHTLTE 3103 

9695 GAA ATA CAC AAC AOG GAT GOG AGG CCC TTC ATC ATG ATC CTG GGC TCA AGG AAT TCC ATA 9754 

3104 EIHNRDARPFIMILGSRNSI 3123 

9755 TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA 9814 

3124 SNRAKTARNINLYTGNDPRE 3143 

9815 ATA CGA GAC TTO ATG OCT GCA GGG CGC ATG TTA CTA CTA GCA CTG AGG GAT GTC GAC CCT 9874 

3144 IRDLMAAGRMLVVALRDVDP 3163 

9875 GAG CTC TCT GAA ATC GTC GAT TTC AAG GOO ACT TIT TTA GAT AGO GAG OCC CTG GAG GCT 9934 

3164 ELSEMVDFKGTFLDREALEA 3183 

9935 CTA ACT CTC GGG CAA CCT AAA CCG AAG CAG GTT ACC AAG GAA GCT GTT AGG AAT TTG ATA 9994 

3184 LSLGOPKPKQVTKEAVRNLI 3203 

9995 GAA CAG AAA AAA GAT OTQ GAG ATC CCT AAC TGG TTT GCA TCA GAT GAC CCA CTA TIT CTG 10054 

3204 EQKKDVEIPNWFASDDPVFL 3223 

10055 GAA CTG GCC TTA AAA AAT GAT AAG TAC TAC TTA OTA GGA GAT GTT GGA GAG CTA AAA GAT 10114 

3224 EVALKNDKYYLVGDVGELKD 3243 

10115 CAA GCT AAA OCA CTT GGG GCC AOG GAT CAG ACA AGA ATT ATA AAG GAG CTA GGC TCA AGG 10174 

3244 QAKALGATDQTRIIKEVGSR 3263 

10175 AOG TAT GCC ATC AAG CTA TCT AGC TGG TTC CTC AAG GCA TCA AAC AAA CAG ATC ACT TTA 10234 

3264 TYAMKLSSWFLKASNKQMSL 3283 

10235 ACTCCACTCTTTGAGGAATTGTrGCTAOGGTOCCCACCTGCAACTAAGAGCAATAAOOGG 10294 

3284 TPLFEELLLRCPPATKSNKG 3303 

10295 CAC ATC GCA TCA GCT TAC CAA TTC GCA CAG GCT AAC TGG GAG CCC CTC GCT TGC GGG GTC 10354 

3304 HMASAYQLAQGNWEPLGCGV 3323 
10355 CAC CTA GGT ACA ATA CCA GCC AGA AGG GTG AAG ATA CAC CCA TAT GAA GCT TAC CTG AAG 10414

3324 HLGTIPARRVKIHPYEAYLK 3343

10415 TTG AAA GAT TTC ATA GAA GAA GAA GAG AAG AAA CCT AGG GTT AAG GAT ACA OTA ATA AGA 10474 

3344 LKDFIEEEEKKPRVKDTVIR 3363 

10475 GAG CAC AAC AAA TGG ATA CTT AAA AAA ATA AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA 10534 

3364 EHNKWILKKIRFQGNLNTKK 33S3 

10535 ATG CTC AAC CCG GGG AAA CTA TCT GAA CAG TTG CAC AGG GAG GGG CGC AAG AGG AAC ATC 10594 

3384 MLNPGKLSEQLDREGRKRNI 3403 

10595 TAC AAC CAC CAG ATT GGT ACT ATA ATG TCA AGT GCA GGC ATA AGG CTG GAG AAA TTG CCA 10654 

3404 YNHQIGTIMSSAGIRLEKLP 3423 

10655 ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC TTT CAT GAG GCA ATA AGA GAT AAG ATA GAC 10714 

3424 IVRAQTDTKTFHEAIRDKID 3443 

10715 AAG AGT GAA AAC CGG CAA AAT CCA GAA TTG CAC AAC AAA TTG TTG GAG ATT TTC CAC ACG 10774 

3444 KSENRQNPELHNKLLEIFHT 3463 

10775 ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC GGT GAG GTG ACG TGG GAG CAA CTT GAG GCG 10834 

3464 lAQPTLKHTYGEVTWEQLEA 34B3 

10835 GGG ATA AAT AGA AAG GGG GCA GCA OGC TTC CTG GAG AAG AAG AAC ATC GGA GAA OTA TTG 10894 

3484 GINRKGAA6FLEKKNZGEVL 3503 

10895 GAT TCA GAA AAG CAC CTG OTA GAA CAA TTG GTC AGG GAT CTG AAG GCC OGQ AGA AAG ATA 10954 

3504 DSEKHLVEQLVRDLKAGRKI 3523 

10955 AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT GAG AAG AGA GAT OTC AOT GAT GAC TGG CAG 11014 

3524 KYYETAI PKNEKRDVSDDWQ 3543 

11015 GCA GGG GAC CTG GTG GTT GAG AAG AGG CCA AGA GTT ATC CAA TAC CCT GAA GCC AAG ACA 11074 

3544 AGDLVVEKRPRVIQYPEAKT 3563 

11075 AOG CTA GCC ATC ACT AAG GTC ATG TAT AAC TGG GTG AAA CAG CAG CCC GTT GTG ATT CCA 11134 

3564 RLAITKVMYNWVKQQPVVIP 3583

11135 GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC 11194

3584 GYEGKTPLFNIFDKVRKEWD 3603

11195 TCG TTC AAT GAG CCA GTG GCC OTA AOT TTT GAC ACC AAA GCC TGG GAC ACT CAA GTG AOT 11254 

3604 SFNEPVAVSFDTKAWDTQVT 3623 

11255 AOT AAG GAT CTG CAA CTT ATT GGA GAA ATC CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC 11314 

3624 SKDLQLIGEIQKYYYKKEWK 3643 

11315 AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG ACA GAA OTA CCA GTT ATA ACA GCA GAT GGT 11374 

3644 KFIDTITDHMTEVPVZTADG 3663 

11375 GAA OTA TAT ATA AGA AAT GGG CAG AGA GGG AGC GGC CAG CCA GAC ACA ACT OCT GGC AAC 11434 

3664 EVYIRNGQ RGSGQPDTSAGN 3683 

11435 AGC ATG TTA AAT GTC CTG ACA ATG ATG TAC GGC TTC TGC GAA AGC ACA GGG OTA CCG TAC 11494 

3684 SMLNVLTMMYGFCESTGVPY 3703 

11495 AAG AOT TTC AAC AGG GTG GCA AOG ATC CAC GTC TOT GGG GAT GAT GGC TTC TTA ATA ACT 11554 

3704 KSFNRVARI HVCGDDGFLIT 3723 

11555 GAA AAA GGG TTA GGG CTG AAA TTT GOT AAC AAA GGG ATG CAG ATT CTT CAT GAA GCA OGC 11614 

3724 EKGLGLKFANKGMQILHEAG 3743 

11615 AAA CCT CAG AAG ATA ACG GAA GGG GAA AAG ATG AAA GTT GCC TAT AGA TTT GAG GAT ATA 11674 

3744 KPQKITEGEKMKVAYRFEDI 3763 

11675 GAG TTC TOT TCT CAT ACC CCA GTC CCT GTT AOG TOG TCC GAC AAC ACC AOT ACT CAC ATG 11734 

3764 EFCSHTPVPVRWSDNTSSHM 3783 

11735 GCC GGG AGA GAC ACC GCT GTG ATA CTA TCA AAG ATG GCA ACA AGA TTG GAT TCA ACT GGA 11794 

3784 AGRDTAVI LSKMATRLDSSG 3803 

11795 GAG AGG OCT ACC ACA GCA TAT GAA AAA GOG CTA GCC TTC AOT TTC TTG CTG ATG TAT TCC 11854 

3804 ERGTTAYEKAVAFSFLLMYS 3823 

11855 TGG AAC COG CTT GTT AGG AGG ATT TGC CTG TTG GTC CTT TCG CAA CAC OCA GAG ACA GAC 11914 

3824 WNPLVRRICLLVLSQQPETD 3843 

11915 CCA TCA AAA CAT OCC ACT TAT TAT TAC AAA GOT GAT CCA ATA GGG GCC TAT AAA GAT OTA 11974 

3844 PSKHATYYYKGDPIGAYKDV 3863 

11975 ATA OCT OGG AAT CTA AOT GAA CTG AAG AGA ACA OGC TTT GAG AAA TTG GCA AAT CTA AAC 12034 

3864 IGRNLSBLKRTGFEKLANLN 3883 

12035 CTA AGC CTG TCC ACG TTG OGG ATC TOG ACT AAG CAC ACA AGC AAA AGA ATA ATT CAG GAC 12094 

3884 LSLSTLGIWTKHTSKRI IQD 3903 
12095 TGT GTT GCC ATT GGG AAA GAA GAG GGC AAC TGG CTA GTT AAC GCC GAC AGG CTG ATA TCC 12154
3904 CVAIGKEEGNWLVNADRLIS 3923

12155 AGC AAA ACT GGC CAC TTA TAG ATA CCT GAT AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT 12214 
3924 SKTGHLYIPDKGFTLQGKHY 3943 

12215 GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC CCG GTC ATG GOG GTT QGG ACT GAG AGA TAC 12274 
3944 EQLQLRTETNPVMGVGTERY 3963 

12275 AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG AGA AGO TTG AAA ATT CTG CTC ATG ACQ GCC 12334 
3964 KLGPIVNLLLRRLKILLMTA 3983 

12335 GTC OGC GTC AGC AGC TGA gacaaaatgtatatattgtaaataaattaatccatgtacatagtgtatataaacat 12408 
3984 V G V S S • 3989 

12409 agctgggaccgcccacctcaagaagacgacacgcccaacacgcacagctaaacagtagtcaagattacctacctcaagat 12468 

12489 aacactacacctaacgcacacagcactctagccgtatgaggatacgcccgacgtccatagttggactagggaagacctct 12568 

12569 aacagccccccgcaggc caattaaccagcgggaatacgcggggcacgccgcgccccagcacaccgacgacccaACCctca 12648 

12649 cgtctgacagcccatcaccgtcgagcaagacgtttcccgttgaatatggctcataacaccccttgcattactgtttatgt 12728 

12729 aagcagacagtcccactgtccatgatgatatatttttatcttgtgcaatgcaacatcagagattttgagacacgcggctt 12808 

12809 tgtcgaacaaatcgaactctcgccgagctgaaggaccagatcacgcatcttcccgacaacgcagaccgtcccgcggcaaa 12888 

12 889 gcaaaagcccaaaatcaccaactggtccacctacaacaaagctctcatcaaccgtggctcccccactttccggctggatg 12968 

12969 a tggggcgat tcaggcc tggtatgag t cagcaacacc t tc t tcacgaggc agacc c cagcgc tagcggagcgca t ac egg 13048 

13049 cc cac tatg t tggcac tgatgagggtgccagtgaagcgctccatgcggcaggagaaaaaaggccgcaccggcgcgt cage 13128 

13 129 agaatatgtgatacaggatatattccgcctcctcgctcactgactcgctacgctcggtcgttcgactgcggogagcggaa 13208 

13209 acggcttacgaacggggcggagatttcctggaagatgccaggaagatacttaacagggaagcgagagggccgcggcaaag 13288 

13289 ccgtttttccataggctccgcccccctgacaagcatcacgaaatccgacgctcaaatcagtggtggcgaaacccgacagg 13368 

13369 actataaagataccaggcgtttcccctggcggctccctcgtgcgctctcctgttcccgcctttcggtttaccggcgtcat 13448 

13449 tccgccgctatggccgcgtttgcctcattccacgcctgacactcagctccgggtaggcagtccgccccaagccggaccgt 13528 

13529 acgcacgaaccccccgttcagtccgaccgctgcgccctatccggtaaccatcgtcctgagcccaacccggaaagacatgc 13608 

1 3 609 aaaagcaccactggcagcagccac egg taac tgat t tagaggagt cagtcc tgaagtcatgcgccggt caaggccaaacc 13688 

13689 gaaaggacaagc t ccggcgactgcgc tcccccaagccagt caccccgg t tcaaagagc tggtagc tcagagaacc c tcga 13768 

13769 aaaaccgccccgcaaggcgg c cct C ccgt cc ccagagcaagagac cacgcgcagaccaaaacgacc ccaagaagaccatc 1 3 84 8 

13849 ccaccaaggggtccgacgcccagcggaacgoaaacccacgccaagggaccccggccacgagaccaccaaaaaggaccccc 13928 

13929 acccagaccctttcaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacctggtctgacagtcacca 14008 

14009 acgctcaaccagcgaggcacccacctcagcgatccgcccaccccgtccacccacagtCgcccgaccccccgccgcgtaga 14088 

14089 taaccacgatacgggagggcttaccacccggccccagcgctgcaatgataccgcgagacccacgctcaccggctccagat 14168 

14169 ccaccagcaacaaaccagccagccggaagggccgagcgcagaagcggccctgcaactccatccgcccccacccagtccac 142 48 

14249 taaccgtcgccgggaagccagagtaagtagtccgccagtcaatagttcgcgcaacgttgttgccattgctgcaggcatcg 14328 

14 329 tggtgtcacgctcgtcgtttggtatggccccacccagccccggttcccaacgaccaaggcgagccacatgaccccccatg 14408 

14409 ccgcgcaaaaaagcggtcagccccctcggccctccgatcgccgtcagaagcaagccggccgcagcgtcatcactcatggt 14488 

14489 cacggcagcaccgcacaacccccccactgccacgccacccgcaagacgcccccccgcgaccggcgagcacccaaccaagc 14568 

14 569 cactccgagaacagcgtatgcggcgaccgagccgctcc tgcccggcgtcaacacgggataataccgcgccacatagcaga 14648 

14649 actccaaaagtgcccatcattggaaaacgccctccggggcgaaaactcccaaggacctcaccgccgccgagatccagttc 14728 

14729 gatgtaacccacccgtgcacccaactgacccccagcatcccccacccccaccagcgtccccgggcgagcaaaaacaggaa 14808 

14809 ggcaaaaCgccgcaaaaaagggaaCaagggcgacacggaaacgccgaacacCcacacccccccccccccMCatcaccga 1 4688 

14889 agcatctatcagggctattgccccatgagcggacacatatttgaatgtacccagaaaaacaaacaaataggggttccgcg 14968 

14969 cacacccccccgaaaagcgccacccgacgccgacccgaggcaaccacaacccgggccctacatatggacccaatcccaga 15048 

15049 caacacgactcaccaca 15065 
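
Each coding row of the listing above pairs twenty codons with the one-letter translation printed beneath it. The following is a minimal sketch, assuming the standard genetic code, of how a single row can be checked against its printed translation. The function names are hypothetical and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_row(codons: str) -> str:
    """Translate whitespace-separated codons; unreadable codons become '?'."""
    return "".join(CODON_TABLE.get(c.upper(), "?") for c in codons.split())


def mismatches(codons: str, printed: str) -> list:
    """Return (position, translated, printed) for each residue that disagrees."""
    translated = translate_row(codons)
    return [
        (i, t, p)
        for i, (t, p) in enumerate(zip(translated, printed), start=1)
        if t != p
    ]


if __name__ == "__main__":
    # Row beginning at nucleotide 395 of the listing.
    row = ("ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC "
           "AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT")
    print(translate_row(row))
    print(mismatches(row, "ITNELLYKTYKQKPVGVEEP"))
```

A non-empty result from `mismatches` flags a residue where the codon and the printed amino acid disagree, which is one way to locate transcription errors in a row.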
DNA sequence 12578 b.p. gtatacgagaat ... ccaacagccccc linear

1 gtatacgagaatcagaaaaggcacccgtatacgcattgggcaactaaaaacaacaactaggcccagggeiacaaacccctc 80 

81 tcagcgaaggccgaaaagaggccagccacgcccc cagtaggaccagcacaacgaggggggtagcaacagcgg tgagcccg 1 60 

161 ttggacggcctaagccctgagtacagggtagtcgtcagcggttcgacgccttggaacaaaggtctcgagatgccacgtgg 240 

24 1 acgagggcBCgcccaaagcacatcccaacccgagcgggggccgcccaggtaaaagcagc 1 1 taaccgaccgc tacrgaa ta 3 20 

321 cagcccgatagggcgccgcagaggcccactgcatcgccaccaaaaacccctgccgcacatggcac ATG GAG TTG 394 
1 M E L 3 

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454 
4ITNELLYKTYKQKPVGVEEP 23 

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AQG GGA GCA GTC CAC CCT CAA TCG 514 
24VYDQAGDPLFGERGAVHPQS 43 

515 ACG CTA AAG CTC CCA CAC AAG AGA GOG GAA CQC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574 
44TLKLPHKRGERDVPTNLASL 63 

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634 
64PKRGDCRSGNSRGPVSGIYL 83 

635 AAG OCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCG CTG 694 
84KPQPLFYQDYKGPVYHRAPL 103 

695 GAG CIC TTT GAG GAGGGATOCATGTGTGAAAOGACTAAACGG ATA GGG AGA OTA ACT QGA 754 
104ELPEEGSMCBTTKRIGRVTG 123 

755 ACT GAC GGA AAO CTG TAC CAC ATT TAT GTG TGT ATA GAT GGA TGT ATA ATA ATA AAA AGT 814 
124SD6KLYHIYVCIDOCIIIKS 143 

815 GCC AOG AGA ACT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874 
144ATRSYQRVFRWVHNRLDCPL 163 

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934
164 WVTTCSDTKEEGATKKKTQK 183

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994
184 PDRLERGKMKIVPKESEKDS 203

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054
204 KTKPPDATIVVEGVKYQVRK 223

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAG GAC GGC TTG TAC CAT AAC AAA AAC AAA CCT 1114
224 KGKTKSKNTQDGLYHNKNKP 243 

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GOT ATA GTT 1174 
244 QES.RKKLEKALLAWA I I A I V 263 

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GGG ACG 1234 
264 LFQVTHGENITQWNLQONGT 283 

1235 GAA GGG ATA CAA GGG GCA ATG TTC CAA AGG GGT GTG AAT AGA AGT TTA CAT GGA ATC TOG 1294 
284 EGIQRAMFQRGVNRSLHGXW 303 

1295 CCA GAG AAA ATC TffP ACT GOT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AAA ACA 1354 
304 PEKICTGVPSHLATDIELKT 323 

1355 ATT CAT GOT ATG ATG GAT GCA ACT GAG AAG ACC AAC TAC AOG TGT TGC AGA CTT CAA CGC 1414 
324 1HGMMDASEKTNYTCCRLQR 343 

1415 CAT GAG TOG AAC AAG CAT GGT TGG TGC AAC TGG TAC AAT ATT GAA CCC TGG ATT CTA GTC 1474 
344 HEWNKHGWCNWYNIEPWILV 363 

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534 
364 MNRTOANLTEGQPPRECAVT 383 

1535 TGT AOG TAT GAT AGG OCT AGT GAC TTA AAC GTG GTA ACA CAA OCT AGA GAT AGC CCC ACA 1594 
384 CRYDRASDLNVVTQARDSPT 403 

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CGG GGC 1654 
404 PL TGCKKGKNFSFAGILMRG 423 

1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT AGT 1714 
424 PCNFEIAASDVLFKEHERIS 443 

1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GGT GCC 1774 
444 MFQDTTLY LVDGLTNSLEGA 463 
1775 AGA CAA GGA ACC GCT AAA CTG ACA ACC TGG TTA GGC AAG CAG CTC GGG ATA CTA GGA AAA 1834

464 RQGTAKLTTWLGKQLGILGK 483 

1835 AAGTTGGAAAACAAGAGTAAGACXSTQGTTTGGAGCATACGCT OCT TCC OCT TAC TGT GAT 1894 

484 KLENKSKTWFGAYAASPYCD 503 

1895 GTC GAT CGC AAA ATT GGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TGC TTA CCC 1954 

504 VDRKIGYIWYTKNCTPACLP 523 

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAC ACC AAT CCA GAG GAC GGC AAG ATA 2014 

524 KNTKIVGPGKFDTNAEDGKI 543 

2015 TTA CAT GAG ATG OGG QGT CAC TTG TOG GAG GTA CTA CTA CTT TCT TTA GTG GTG CTG TCC 2074 

544 LHEMGGHLSEVLLLSLVVLS 563 

2075 GAC TTC GCA CCG GAA ACA GCT AC?r GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 DFAPETASVMYLI LHFS I PQ 583 

2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLTVELT 603

2195 ACA GCT GAA GTA ATA CCA GGG TOG GTC TGG AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 TAEVIPGSVWNLGKYVCIRP 623 

2255 AAT TOG TGG CCT TAT GAG ACA ACT GTA GTG TTG GCA TVT GAA GAG GTG AGC CAG GTG GTG 2314 

624 NWWPYETTVVLAFEEVSQVV 643 

2315 AAO TTA GTG TTG AOG OCA CTC AGA GAT TTA ACA CGC ATT TGG AAC OCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIWNAATTT 663 

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA GTC AGG GGC CAG ATG CTA CAG GGC ATT CTG TOG 2434 

664 AFLVCLVKIVRGQMVQGILW 683 

2435 CTA CTA TTG ATA ACA GOG GTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TCG TAT GCC 2494 

684 LLLITGVQGHLDCKPEFSYA 703 

2495 ATA GCA AAG GAC GAA AGA ATT GGT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 lAKDERIGOLGAEGLTTTWK 723 



2555 GAA TAC TCA CCT OGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614 

724 EYSPGMKLEDTMVIAWCEDG 743 

2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AOG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMYLQRCTRETRYLAILHT 763 

2675 AGA OCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CGA AAG CAA GAG GAT 2734 

764 RALPTSVVFKKLPDGRKQED 783 

2735 GTA GIC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794 

784 VVEMNDNFEFGLCPCDAKPI 803

2795 OTA AGA OGG AAG TTC AAT ACA ACG CTG CTG AAC GGA CCG GCC TTC CAG ATG GTA TGC CCC 2854 

804 VRGKFNTTLLMGPAFOMVCP 823 

2855 ATA GGA TGG ACA GGG ACT OTA AGC TCT AOG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 

824 I GWTGTVSCTS FNMDT L ATT 843 

2915 GTG GTA CGG ACA TAT AGA AGG TCT AAA CCA TTC CCT CAT AGG CAA GGC TGT ATC ACC CAA 2974 

844 VVRTYRRSKPFPHRQGCITQ 863 

2975 AAG AAT CTG GGG OAO GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 

864 KNLGEDLHNCILGGNWTCVP 883 

3035 OGA GAC CAA CTA CTA TAC AAA OGG GGC TCT ATT GAA TCT TGC AAG TGG TC?r GGC TAT CAA 3094 

884 GDQLLYKOGSIESCKWCGYQ 903 

3095 TTT AAA GAG AGT GAG OGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHYPIGKCKLENE 923 

3155 ACT GGT TAC AGG CTA GTA GAC AGT ACC TCT TGC AAT AGA GAA GGT GTG GCC ATA GTA CCA 3214 

924 TGYRLVDSTSCNREGVAIVP 943 

3215 CAA OOG ACA TTA AAG TOC AAG ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATG GAT ACC 3274 

944 QGTLKCKIGKTTVQVIAMDT 963 

3275 AAA CTC OGA OCT ATG OCT TGC AGA CCA TAT GAA ATC ATA TCA AGT GAG OGG CCT OTA GAA 3334 

964 KLOPMPCRPYEIISSEGPVE 983 

3335 AAG ACA OCG TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 

984 KTACTFNYTKTLKNKY FEPR 1003 

3395 GAC AGC TAC TTT CAG CAA TAC ATG CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454 

1004 DSYFQQYMLKGEYQYWFDLE 1023 

3455 GTG ACT GAC CAT CAC CGG GAT TAC TTC GCT GAG TCC ATA TTA GTG GTG GTA GTA GCC CTC 3514 

1024 VTDHHRDYFAESILVVVVAL 1043 
3515 TTG GGT GGC AGA TAT GTA CTT TGG TTA CTG GTT ACA TAC ATG GTC TTA TCA GAA CAG AAG 3574

1044 LGGRYVLWLLVTYMVLSEQK 1063 

3575 GCC TTA GGG ATT CAG TAT GGA TCA GOG GAA GTG GTG ATG ATG GGC AAC TTG CTA ACC CAT 3634 

1064 ALGIOYGSGEVVMMGNLLTH 1083 

3635 AAC AAT ATT GAA GTG GTG ACA TAC TTC TTG CTG CTG TAC CTA CTG CTG AGG GAG GAG AGC 3694 

1084 NNIEVVTYFLLLYLLLREES 1103 

3695 GTA AAG AAG TGG GTC TTA CTC TTA TAC CAC ATC TTA GTG GTA CAC CCA ATC AAA TCT GTA 3754 

1104 VKKWVLLLYHILVVHPIKSV 1123 

3755 ATT GTG ATC CTA CTG ATG ATT GGG GAT GTG GTA AAG GCC GAT TCA GGG OGC CAA GAG TAC 3814 

1124 IVI LLMIGDVVKADSGGQEY 1143 

3815 rrc GGG AAA ATA GAC CTC TGT TTT ACA ACA GTA GTA CTA ATC GTC ATA GGT TTA ATC ATA 3874 

1144 LGKIDLCFTTVVLIVIGLI I 1163 

3875 GCC AGG CGT GAC CCA ACT ATA GTG CCA CTG GTA ACA ATA ATG GCA GCA CTG AGG GTC ACT 3934 

1164 ARRDPTIVPLVTIMAALRVT 1183 

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC GCT GTG GCG GTC ATG ACT ATA ACC CTA CTG 3994 

1184 ELTHQPGVDI AVAVMTITLL 1203 

3995 ATG CTT AOC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TGG TTA CAO TGC ATT CTC AOC 4054 

1204 MVSYVTDYFRYKKWLQCILS 1223

4055 CTG GTA TCT QCS GTG TTC TTG ATA AGA AGC CTA ATA TAC CTA GGT AGA ATC GAG ATG CCA 4114 

1224 LVSAVFLIRSLIYLGRIEMP 1243 

4115 GAG GTA ACT ATC OCA AAC TGG AGA OCA CTA ACT TTA ATA CTA TTA TAT TTG ATC TCA ACA 4174 

1244 EVTIPNWRPLTUILLYLIST 1263 

4175 ACA ATT GTA ACG AGG TGG AAG GTIT GAC GTG GCT GGC CTA TTG TTG CAA TGT GTG CCT ATC 4234 

1264 TIVTRWKVDVAGLLLQCVPI 1283 

4235 TTA TTG CTG GTC ACA ACC TTG TGG GCC GAC TTC TTA ACC CTA ATA CTG ATC CTG CCT ACC 4294 

1284 LLLVTTLWADFLTLILILPT 1303 






4295 TAT GAA TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AOG ACT GAT ATA GAA AGA AGT TGG 4354 

1304 YELVKLYYLKTVRTDI ERSW 1323 

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG AGT GGA GAG 4414 

1324 LGGIDYTRVDSIYDVDESGE 1343 

4415 GGC GTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GGG AAT TTT TCT ATA CTC TTG CCC 4474

1344 GVYLFPSRQKAQGNFSILLP 1363

4475 CTT ATC AAA GCA ACA CTG ATA AGT TGC GTC AGC AGT AAA TOG CAG CTA ATA TAC ATG AGT 4534 

1364 LIKATLXSCVSSKWQLIYMS 1383 

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AOG AAA GTT ATA GAA GAG ATC TCA OQA 4594 

1384 YLTLDFMYYMHRKVIEEI SO 1403 

4595 GOT ACC AAC ATA ATA TCC AGG TTA OTG OCA GCA CTC ATA GAG CTG AAC TGG TCC ATG GAA 4654 

1404 GTNIISRLVAALIELNWSME 1423 

4655 GAA GAG GAG AGC AAA OGC TTA AAG AAG TTT TAT CTA TTG TCT GGA AGO TPQ AGA AAC CTA 4714 

1424 EEESKGLKKFYLLSGRLRML 1443 

4715 ATA ATA AAA CAT AAG GTA AOG AAT GAG ACC GTG GCT TCT TGG TAC GGG GAG GAG GAA CTC 4774 

1444 I IKHKVRNETVASWYGEEEV 1463 

4775 TAC OGT ATG CCA AAG ATC ATO ACT ATA ATC AAO GCC ACTT ACA CTG AGT AAG AGC AOG CAC 4834 

1464 YGMPK IMTIIKASTLSKSRH 1483 

4835 TGC ATA ATA TGC ACT GTA TGT GAG GGC CGA GAO TOG AAA GGT 000 ACC TGC CCA AAA TGT 4894 

1484 criCTVCEGREWKGGTCPKC 1503 

4895 GGA OGC CAT GGG AAG CCG ATA ACG TOT GGG ATG TOG CTA GCA GAT TTT GAA GAA AGA CAC 4954 

1504 GRHGKPITCGMSLADFEERH 1523 

4955 TAT AAA AGA ATC TTT ATA AGG GAA GGC AAC TTT GAG OGT ATG TGC AGC CGA TGC CAO GGA 5014 

1524 YKRIFIREGNFEGMCSRCQG 1543 

5015 AAG CAT AOG AGG TTT GAA ATG GAC COG GAA CCT AAG AGT GCC AGA TAC TGT OCT GAG TGT 5074 

1544 KHRRFEMDREPKSARYCAEC 1563 

5075 AAT AOG CTG CAT OCT GCT GAG GAA OGT GAC TTT TOG OCA GM3 TOG AOC ATO TTG OGC CTC 5134 

1564 NRLHPAEEGDFWAESSMLGL 1583 

5135 AAA ATC ACC TAC TTT OCG CTC ATG GAT GGA AAG GTG TAT GAT ATC ACA GAG TOG GCT GGA 5194 

1584 KITYFALMDGKVYDITEWAG 1603 

5195 TGC CAG OGT GTG GGA ATC TCC CCA GAT AOC CAC AGA GTC CCT TGT CAC ATC TCA TTP GGT 5254 

1604 CQRVGISPDTHRVPCHISFG 1623 
5255 TCA CGG ATG CCT TTC AGG CAG GAA TAC AAT GGC TTT GTA CAA TAT ACC GCT AGG GGG CAA 5314

1624 SRMPFRQEYNGFVQYTARGQ 1643 

5315 CTA TTT CTG AGA AAC TTG CCC GTA CTG GCA ACT AAA GTA AAA ATG CTC ATG GTA GGC AAC 5374 

1644 LFLRNLPVLATKVKHLHVGN 1663 

5375 CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT CTT GGG TGG ATC CTA AGG GOG CCT GCC GTG 5434 

1664 LGEEIGNLEHLGWILRGPAV 1683 

5435 TGT AAG AAG ATC ACA GAG CAC GAA AAA TGC CAC ATT AAT ATA CTG GAT AAA CTA ACC OCA 5494 

1684 CKKITEHEKCHINILDKLTA 1703 

5495 TTT TTC GGG ATC ATG CCA AGG GOG ACT ACA CCC AGA GCC CCG GTG AGG TTC CCT ACG AGC 5554 

1704 FFGI MPRGTTPRAPVRFPTS 1723 

5555 TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT GCC TOG OCT TAC ACA CAC CAA GGC GGG ATA 5614 

1724 LLKVRRGLETAWAYTHQGG I 1743 

5615 ACT TCA GTC GAC CAT GTA ACC GCC GGA AAA GAT CTA CTG GTC TGT GAC AGC ATG GGA CGA 5674 

1744 SSVDHVTAGKDLLVCDSMGR 1763 

5675 ACT AGA GTC GTT TGC CAA AGC AAC AAC AGG TTG ACC GAT GAG ACA GAG TAT GGC GTC AAG 5734 

1764 TRVVCQSNNRLTDETEYGVK 1783 

5735 ACT GAC TCA GGG TGC CCA GAC GGT GCC AGA TCTT TAT GTG TTA AAT CCA GAG GCC GTT AAC 5794 

1784 TDSGCPDGARCYVLNPBAVN 1803 

5795 ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC CTC CAA AAO ACA GGT GGA GAA TTC ACG TGT 5854 

1804 ZSGSKGAVVKLQKTGGEFTC 1823 

5855 GTC ACC OCA TCA GGC ACA CCG OCT TTC TTC GAC CTA AAA AAC TTG AAA GGA TGG TCA OGC 5914 

1824 VTASGTPAFFDLKNLKGWSG 1843 

5915 TTC CCT ATA TTT GAA GCC TGC AGC GGG AGG GrXQ OTT GOC AGA CRC AAA GTA GOG AAO AAT 5974 

1844 LPI FEASSGRVVGRVKVGKN 1863 

5975 GAA GAG TCT AAA CCT ACA AAA ATA ATG AGT GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA 6034 

1864 EESKPTKIMSGIQTVSKNRA 1883

6035 GAC CTC ACC GAG ATO GTC AAG AAO ATA ACC AGC ATC AAC AOG GGA GAC TTC AAG CAG ATT 6094

1884 DLTEMVKKITSMNRGDFKQI 1903

6095 ACT TTO GCA ACA GGG GCA OGC AAA ACC ACA GAA CTC CCA AAA GCA CrTT ATA GAG GAG ATA 6154

1904 TLATGAGKTTELPKAVIEEI 1923

6155 GGA AGA CAC AAG AGA CTTA TTA GTT CTT ATA CCA TTA AOG GCA GCG GCA GAG TCA GTC TAC 6214

1924 GRHKRVLVLIPLRAAAESVY 1943 



6215 CAG TAT ATC AGA TTC AAA CAC CCA AGC ATC TCT TTT AAC CTA AGG ATA GGG GAC ATC AAA 6274 

1944 QYMRLKHPSISFNLRIGDMK 1963 

6275 GAG GGG GAC ATC GCA ACC GGG ATA ACC TAT GCA TCA TAC GGG TAC TTC TGC CAA ATC CCT 6334 

1964 EGDMATGITYASYGYFCQMP 1983 

6335 CAA CCA AAG CTC AGA OCT OCT ATC GTA GAA TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT 6394 

1984 QPKLRAAMVEYSYIFLDEYH 2003 

6395 TGTOCCACTOCTaAACAACTCGCAATrATCQOQAAGATCCACAGATITTCAGAGAQTATA 6454 

2004 CATPEQLAIIGKIHRFSESI 2023 

6455 AOG Gnr GTC GCC ATC ACT GCC AOG CCA GCA GOG TCG GTC ACC ACA ACA GGT CAA AAG CAC 6514 

2024 RVVAMTATPAGSVTTTGQKH 2043 

6515 CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA ATC AAA GOG GAG GAT CTT OGT AGT CAG TTC 6574 

2044 PIEEFIAPEVMK6 EDLGSQF 2063 

6575 CTT GAT ATA OCA GGG TTA AAA ATA CCA GTO GAT GAG ATO AAA GGC AAT ATC TTO GTT TTT 6634 

2064 LDI AGLKI PVDEMKGNMLVF 2083 

6635 GTA CCA ACG AGA AAC ATC GCA GTA GAG GTA GCA AAG AAG CTA AAA OCT AAG OGC TAT AAC 6694 

2084 VPTRNMAVEVAKKLKAKGYN 2103 

6695 TCT GGA TAC TAT TAC AGT GGA GAG GAT CCA GCC AAT CTC AGA GTT GTO ACA TCA CAA TCC 6754 

2104 SGYYYSGEDPANLRVVTSQS 2123 

6755 CCC TAT GTA ATC CTG GCT ACA AAT OCT ATT GAA TCA GGA GTC ACA CTA CCA GAT TTC GAC 6814 

2124 PYVIVATNAI ESGVTLPDLD 2143 

6815 ACQ GTT ATA GAC ACG GGG TTO AAA TGT GAA AAG AGG GTO AGG GTA TCA TCA AAG ATA CCC 6874

2144 TVI DTGLKCEKRVRVSSKI P 2163 

6875 TTC ATC GTA ACA OGC CTT AAG AOG ATC GOC GTC ACT GTO GOT GAG CAG OCG CAG GGT AOG 6934 

2164 FIVTGLKRMAVTVCEQAQRR 2183 

6935 OGC ABA GTA OGT AGA GTC AAA CCC GGG AOG TAT TAT AGO AGC CAG GAA ACA OCA ACA OOG 6994 

2184 GRVORVKPORYYRSQETATG 2203 
6995 TCA AAG GAC TAG CAC TAT GAC CTC TTG CAC GCA CAA AGA TAG GGG ATI* GAG GAT OGA ATG 7054

2204 SKDYHYOLLOAQRYGI EDG I 2223 

7055 AAC era ACG AAA TCC TTT AGG GAG ATG AAT TAG GAT TGG AGC CTA TAG GAG GAG GAG AGC 7114

2224 NVTKSFREMNYDWSLY E EDS 2243 

7115 CTA CTA ATA ACC GAG GTG GAA ATA CTA AAT AAT CTA GTG ATC TGA GAA GAC TTG CCA GCC 7174 

2244 LLITQLEILNNLLISEDLPA 2263 

7175 OCT GTT AAG AAC ATA ATG GCC AGG ACT GAT CAC CCA GAG CCA ATC CAA CTT GCA TAG AAC 7234 

2264 AVKNIMARTDHPEPIQLAYN 2283 

7235 AGC TAT GAA GTG GAG GTC CCG GTC CTG TTC CCA AAA ATA AGG AAT GGA GAA GTG AGA GAC 7294 

2284 SYEVQVPVLFPKIRNGEVTD 2303 

7295 ACG TAG GAA AAT TAC TCG TTT CTA AAT GCC AGA AAG TTA GGG GAG GAT GTG GCC GTG TAT 7354 

2304 TYEIIYSFLNARKLGEDVPVY 2323 

7355 ATC TAC GCT ACT GAA GAT GAG GAT CTG GCA GTT GAC CTC TTA GGG CTA GAC TGG CCT GAT 7414 

2324 lYATEDEDLAVDLLGLDWPD 2343 

7415 CCT GGG AAC GAG CAG GTA GTG GAG ACT GGT AAA GCA CTG AAG CAA GTG ACC GGG TTG TCC 7474 

2344 PGNQQVVETGKALKQVTGLS 2363 

7475 TCG GCT GAA AAT GCC CTA CTA GTG GCT TTA TTT GGG TAT GTQ GGT TAC CAG GCT CTC TCA 7534 

2364 SAENALLVALPGYVGYQALS 2383 

7535 AAG AGG CAT GTTC CCA ATG ATA AGA GAC ATA TAT ACC ATC GAC GAC CAG AGA CTA GAA GAC 7594 

2384 KRHVPHITDIYTIEDQRLED 2403 

7595 ACC ACC CAC CTC CAG TAT OCA OCC AAC GCC ATA AAA ACC GAT OOG ACA GAG ACT GAA CTG 7654 

2404 TTHLOYAPNAIKTOGTETEL 2423 

7655 AAA GAA CTG GGG TCG GGT GAC GTG GAA AAA ATC ATG GGA GCC ATT TCA GAT TAT GCA OCT 7714 

2424 KELASGDVEKIMGAISDYAA 2443 

7715 GGG OGA CTG GAG TTT GTT AAA TCC CAA GCA GAA AAG ATA AAA ACA GCT CCT TTG TTT AAA 7774 

2444 GGLEFVKSQAEKIKTAPLFK 2463 

7775 GAA AAC GCA GAA GCC OCA AAA OGG TAT GTC CAA AAA TTC ATT GAC TGA TTA ATT GAA AAT 7834

2464 ENAEAAKGYVQKFXOSLIEN 2483



7835 AAA GAA GAA ATA ATC AGA TAT GGT TTG TGG GGA ACA CAC ACA OCA CTA TAG AAA AGC ATA 7894 

2484 KEEIIRYGLWGTHTALYK S I 2503 

7895 GCT GCA AGA CTG GOG GAT GAA ACA GCC TTT GCC ACA CTA GTG TTA AAG TGG CTA OCT TTT 7954 

2504 AARLGHETAFATLVLKWLAF 2523 

7955 GGA OGG GAA TCA GTG TCA GAC CAC GTC AAG CAG GGG GCA GTT GAT TTA GTG GrPC TAT TAT 8014 

2524 GGESVSDHVKQAAVDLVVYY 2543 

8015 GTG ATG AAT AAG CCT TCC TTC CCA GGT GAC TCC GAG ACA CAG CAA GAA GGG AGG GGA TTC 8074 

2544 VMNKPSFPGDSETQQEGRRF 2563 

8075 GTC OCA AGC CTG TTC ATC TCC GCA CTG GCA ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC 8134 

2564 VASLFISALATYTYKTWNYH 2583 

8135 AAT CTC TCT AAA GIO G?K5 GAA CCA GCC CTG OCT TAC CTC OOC TAT OCT ACC AGC GCA TTA 8194 

2584 NLSKVVEPALAYLPYATSAL 2603 

8195 AAA ATG TTC ACC CCA ACG COG CTG GAQ AGC GTG GTG ATA CTG AGC ACC ACQ ATA TAT AAA 8254 

2604 KMFTPTR LESVVILSTTI YK 2623 

8255 ACA TAC CTC TCT ATA AGO AAG GOG AAG AGT GAT GGA TTG CTG GOT ACG GGG ATA AGT GCA 8314 

2624 TYLSIRKGKS.DGLLGTGI SA 2643 

8315 OCC ATG GAA ATC CTG TCA CAA AAC CCA OTA TOG GTA GGT ATA TCT GTG ATG TTG GGG GTA 8374 

2644 AMEILSQNPVSVGI SVMLOV 2663 

8375 GGG GCA ATC GCT OCQ CAC AAC GCT ATT GAG TOG AGT GAA CAG AAA AGG ACC CTA CTT ATG 8434 

2664 GAIAAHNAIESSEQKRTLLM 2683

8435 AAG GTG TTT GTA AAG AAC TTC TTG GAT CAG GCT GCA ACA GAT GAG CTG CTA AAA GAA AAC 8494 

2684 KVFVKNFLDQAATDELVKEN 2703 

8495 CCA GAA AAA ATT ATA ATG GCC TTA TTT GAA GCA GTC CAG ACA ATT GCT AAC GCC CTG AGA 8554 

2704 PEKIIMALFEAVQTIGNPLR 2723 

8555 CTA ATA TAC CAC CTG TAT OGG GTT TAC TAC AAA GOT TGG GAG GCC AAG GAA CTA TCT GAG 8614

2724 LIYHLYGVYYKGWEAKELSE 2743 

8615 AOG ACA OCA OGC AGA AAC TTA TTC ACA TTG ATA ATG TTT GAA OCC TTC GAG TTA TTA OGG 8674

2744 RTAGRNLFTLIMFEAFELLG 2763 

8675 ATG GAC TCA GAA GGG AAA ATA AOG AAC CTG TCC GGA AAT TAC ATT TTG GAT TTG ATA TAC 8734 

2764 HDSQ GKIRNLSGNY ILDLIY 2783 



8735 GGC CTA CAC AAG CAA ATC AAC AGA GGG CTG AAG AAA ATG GTA CTC GGG TGG GCC CCT GCA 8794

2784 GLHKOINRGLKKMVLGWAPA 2803 

8795 CCCTTTAGTTGTGACTGGACrCCTAOTGACCAGAOGATCAGATTGa^ 8854 

2804 PFSCDWTPSDERIRLPTDNY 2823 

8855 TTG AOG CTA GAA ACC AGG TGC CCA TGT GGC TAT GAG ATG AAA GCT TTC AAA AAT CTA GG^ 8914 

2824 LRVETRCPCGYEMKAFKNVG 2843 



8915 GGC AAA CTT ACC AAA GTG GAG GAG AGC GGG CCT TTC CTA TGT AGA AAC AGA CCT QGT AGO 8974
2844 GKLTKVEESGPFLCRNRPGR 2863



8975 GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT TAG GAT GAC AAC CTC AGA GAG ATA AAA CCA 9034 

2864 GPVNYRVTKYYODNLREIKP 2883 

9035 GTA GCA AAG TTG GAA GGA CAG GTA GAG CAC TAC TAC AAA GGG CTC ACA GCA AAA ATT GAC 9094 

2884 VAKLEGQVEHYYKGVTAKID 2903 



9095 TAC ACT AAA GGA AAA ATG CTC TTG GCC ACT GAC AAG TGG GAG GTG GAA CAT OCT GTC ATA 9154
2904 YSKGKMLLATDKWEVEHGVI 2923

9155 ACC AGO TTA GCT AAG AGA TAT ACT GGG GTC GGG TTC AAT GCT GCA TAC TTA OCT GAC GAG 9214
2924 TRLAKRYTGVGFNGAYLGDE 2943

9215 CCC AAT CAC CXTT GCT CTA GTG GAG AOG GAC TOT GCA ACT ATA AOC AAA AAC ACA OTA CAG 9274
2944 PNHRALVERDCATITKNTVQ 2963

9275 TIT CTA AAA ATG AAG AAG GOG TOT GGG TTC AOC TAT GAC CTG ACC ATC TCC AAT CTC ACC 9334
2964 FLKHKKGCAFTYDLTISNLT 2983

9335 AGG CTC ATC GAA CTA GTA CAC AOG AAC AAT CTT GAA GAG AAG GAA ATA CCC ACC GCT ACG 9394
2984 RLIELVHRNNLEEKEIPTAT 3003

9395 GTC ACC ACA TOG CTA GCT TAC ACC TTC GTG AAT GAA GAC CTA GGG ACT ATA AAA CCA CTA 9454
3004 VTTWLAYTFVNEDVGTIKPV 3023

9455 CTA GGA GAG AGA CTA ATC CCC GAC CCT CTA GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA 9514
3024 LGERVIPDPVVDINLQPEVQ 3043

9515 aiX3 GAC ACQ TCA GAG GOT QGQ ATC ACA ATA ATT OGA AGG GAA ACC CTG ATG ACA ACQ GGA 9574
3044 VDTSEVGITIIGRETLMTTG 3063

9575 GTG ACA CCT GTC TTG GAA AAA CTA GAG CCT GAC GCC AOC GAC AAC CAA AAC TCG GTC AAG 9634
3064 VTPVLEKVEPDASDNQNSVK 3083

9635 ATC GOG TTG GAT GAG GCT AAT TAC CCA GOG CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA 9694
3084 IGLDEGNYPGPGIQTHTLTE 3103

9695 GAA ATA CAC AAC AGG GAT GCG AGG CCC TTC ATC ATO ATC CTO GGC TCA AGG AAT TCC ATA 9754
3104 EIHNRDARPFIMILGSRNSI 3123

9755 TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA 9814
3124 SNRAKTARNINLYTGNDPRE 3143

9815 ATA OGA GAC TTG ATO GCT GCA GGG CGC ATG TTA CTA OTA GCA CTG AGG GAT GTC GAC CCT 9874
3144 IRDLMAAGRMLVVALRDVDP 3163

9875 GAG CTG TCT GAA ATG GTC GAT TTC AAG GGG ACT TTT TTA GAT AGG GAG GCC CTG GAG GCT 9934
3164 ELSEMVDFKGTFLDREALEA 3183



9935 CTA ACT CTC 03G CAA CCT AAA CCG AAG CAG GTT AOC AAG GAA GCT GTT AGG AAT TTG ATA 9994 

3184 LSLGQPKPKQVTKEAVRNLI 3203 

9995 GAA CAG AAA AAA GAT CTC GAG ATC CCT AAC TGG TTT GCA TCA GAT GAC CCA CTA TIT CTG 10054 

3204 BQKKDVEIPNWFASDDPVFL 3223 

10055 GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC TTA CTA GGA GAT GTT GGA GAG CTA AAA GAT 10114 

3224 EVALKNDKYYLVGDVG E LKD 3243 

10115 CAA GCT AAA GCA CTT GGG GCC ACG GAT CAG ACA AGA ATT ATA AAG GAG CTA GGC TCA AGG 10174 

3244 QAKALGATDQTRIIKEVGSR 3263 

10175 ACG TAT GCC ATG AAG CTA TCT AGC TOG TTC CTC AAG GCA TCA AAC AAA CAG ATG ACT TTA 10234 

3264 TYAMKLSSWFLKASNKQMSL 3283 

10235 ACTCCACTGTITGAGGAATTGTTGCTACGGTGCCCAOCTGCAACTAAGAGCAATAAGGGG 10294 

3284 TPLFEELLLRCPPATKSNKG 3303 

10295 CAC ATG OCA TCA GCT TAC CAA TTC GCA CAG GCT AAC TGG GAG CCC CTC GCT TCC GGG GTC 10354 

3304 HMAS AYOLAQGNWE P LG CGV 3323 

10355 CAC CTA OCT ACA ATA CCA GOC AGA AGO GTC AAG ATA CAC CCA TAT GAA OCT TAC CTG AAG 10414 

3324 HLGTI PARRVKZHPYEAYLK 3343 

10415 TTC AAA GAT TTC ATA GAA GAA GAA GAG AAO AAA CCT AGG GTT AAG GAT ACA CTA ATA AGA 10474 

3344 LKDFIEEEEKKPRVKDTVIR 3363 
10475 GAG CAC AAC AAA TGG ATA CTT AAA AAA ATA AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA 10534
3364 EHNKWILKKI RFQGNLNTKK 3383 

10535 ATG CTC AAC CCG GOG AAA CTA TCT GAA CAG TTG GAC AGG GAG GOG CGC AAG AGO AAC ATC 10594 
3384 NLNPGKLSCQLDREGRKRNI 3403 

10595 TAC AAC CAC CAG ATT GOT ACT ATA ATG TCA ACT GCA GGC ATA AGG CTG GAG AAA TTG CCA 10654 
3404 YNHQIGTIMSSAGIRLEKLP 3423 

10655 ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC TIT CAT GAG GCA ATA AGA GAT AAG ATA GAC 10714 
3424 IVRAQTDTKTFKEAIRDKID 3443 

10715 AAG AGT GAA AAC CGG CAA AAT CCA GAA TTG CAC AAC AAA TTG TTG GAG ATT TTC CAC ACG 10774 
3444 KSENRQNPELHNKLLE I FHT 3463 

10775 ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC GGT GAG GTG ACG TGG GAG CAA CTT GAG GCG 10834 
3464 lAQPTLKHTYGEVTWEQLEA 3483 

10835 QGG ATA AAT AGA AAG GGG GCA GCA GGC TTC CTG GAG AAG AAG AAC ATC GGA GAA GTA TTG 10894 
3484 GINRKGAAGFLEKKNIGEVL 3503 

10895 GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG GTC AGG GAT CTG AAG GCC GGG AGA AAG ATA 10954 
3504 DSEKHLVEQLVRDLKAGRKI 3523 

10955 AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT GAG AAG AGA GAT GTC AGT GAT GAC TGG CAG 11014 
3524 KYYETAIPKNEKRDVSDDWQ 3543 

11015 GCA GGG GAC CTG GTTG GTT GAG AAG AGG CCA AGA GTT ATC CAA TAC CCT GAA GCC AAG ACA 11074 
3544 AGDLVVEKRPRVIQYPEAKT 3563 

11075 AGG CTA GCC ATC ACT AAG GTC ATG TAT AAC TOG GTG AAA CAS CAG CCC GTT GTG ATT CCA 11134 
3564 RLAITKVMYMWVKOQPVVI P 3583 

11135 GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC 11194 
3584 GYEGKTPLFNIFDKVRKEWD 3603 

11195 TCG TTC AAT GAG CCA GTG GCC GTA AGT TTT GAC ACC AAA GCC TGG GAC ACT CAA GTC ACT 11254 
3604 SFMEPVAVSFDTKAWDTQVT 3623 

11255 ACT AAG GAT CTC CAA CTT ATT GGA GAA ATC CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC 11314 ^ 
3624 SKDLQLIGEIQKYYYKKEWH 3643 

11315 AAG TTC ATT GAC ACC ATC ACC GAC CAC ATC ACA GAA GTA CCA GTT ATA ACA OCA GAT GGT 11374 9 
3644 KFI DTITDHMTEVPVITADG 3663 ^ 

11375 GAA GTA TAT ATA AGA AAT GGG CAG AGA GOG AGC GGC CAG CCA GAC ACA AGT OCT GGC AAC 11434 O 
3664 EVYIRNGQRGSGQPDTSAGN 3683 gj 

11435 AGC ATG TTA AAT CTC CTG ACA ATC ATC TAC GGC TTC TGC GAA AGC ACA GGG CTA CCG TAC 11494 
3684 SMLNVLTMMYGFCESTGVPY 3703 

11495 AAG ACT TTC AAC AGG GTC GCA AGG ATC CAC GTC TCT GGG GAT GAT GGC TTC TTA ATA ACT 11554 
3704 KSFNRVARIHVCGDDGFLIT 3723 

11555 GAAAAAGGGTTAGGOCTGAAATTTGCTAACAAAGGGATCCAGATTCrrCATGAA OCA GGC 11614 
3724 EKGLGLKFANKGMQILHEAG 3743 

11615 AAA CCT CAG AAG ATA ACG GAA GOG GAA AAG ATO AAA GTT GCC TAT AGA TTT GAG GAT ATA 11674 
3744 KPQKITEGEKMKVAYRFEDI 3763 

11675 GAG TTC TOT TCT CAT ACC CCA GTC CCT GTT AGG TGG TCC GAC AAC ACC ACT ACT CAC ATC 11734 
3764 EPCSHTPVPVRWSDNTSSHM 3783 

11735 GCC GOG AGA GAC ACC GCT GTC ATA CTA TCA AAG ATC GCA ACA AGA TTC GAT TCA ACT GGA 11794 
3784 AGRDTAVILSKMATRLDSSG 3803 

11795 GAG AGG GCT ACC ACA GCA TAT GAA AAA GCG CTA GCC TTC ACT TTC TTG CTC ATC TAT TCC 11854 
3804 ERGTTAYEKAVAFSFLLMYS 3823 

11855 TGG AAC CCG CTT GTT AGG AGG ATT TGC CTC TTC GTC CTT TCG CAA CAG CCA GAG ACA GAC 11914 
3824 WNPLVRRICLLVLSQQPETD 3843 

11915 CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA GCT GAT CCA ATA GGG GCC TAT AAA GAT CTA 11974 
3844 PSKHATYYYKGDPIGAYKDV 3863 

11975 ATA GCT CGG AAT CTA ACT GAA CTC AAG AGA ACA GGC TIT GAG AAA TTG OCA AAT CTA AAC 12034 
3864 IGRNLSELKRTGFEKL ANLN 3883 

12035 CTA AGC CTC TCC ACG TTG GGG ATC TOG ACT AAG CAC ACA AGC AAA AGA ATA ATT CAG GAC 12094 
3884 LSLSTLGIWTKHTSKRIZQD 3903 

12095 TCT GTT GCC ATT GOO AAA GAA GAG GGC AAC TOG CTA GTT AAC GCC GAC AGG CTC ATA TCC 12154 
3904 CVAIGKEEGNWLVNADRLIS 3923 

12155 AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT 12214 
3924 SKTGHLYIPDKGFTLQG KHY 3943 



WO 99/55366



PCT/US99/08850 



BVDV NADL (inf. clone) -> Gt,.-s ^'^^^ 4/21/99 5:42:22 PM Page 

12215 GAG CAA CTG CAG CPA AGA ACA GAG ACA AAC CCG GTC ATG GGG GTT OGG ACT GAG AGA TAC 12274 
3944 EOLOLRTETNPVMCVGTERY 3963 

12275 AAG TTA OGT CCC ATA GTC AAT CTG CTG CTG AGA AGG TTG AAA ATT Cro CTC ATG 12334 
3964 KLGPIVNLLLRRLKILLMTA 3983 

12335 GTC GGC GTC AGO AOC TGA gacaaaacgtatataccgtaaataaaccaacccacgtacacagcgcacacaaatat 12408 
3984 V G V S S • 3989 

12409 agctgggaccgcccaccccaagaagacgacacgcccaacacgcacagccaaacagtagtcaagaccatccaccccaagat 12488

12489 aacactacacccaacgcacacagcaccccagccgtatgaggacacgcccgacgtccatagtcggactagggaagacccct 12568 

12569 aacagccccc 12578 
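
The layout of the listing above (nucleotide coordinates at both ends of each codon row, with the residue numbers of the translation on the row beneath) can be checked mechanically. The following is a minimal Python sketch, not part of the original disclosure. It assumes the standard genetic code and an open reading frame starting at nucleotide 386, a value inferred here from the row ending "ATG GAG TTG 394". The helper names translate, nt_start and aa_index are hypothetical. Codons containing characters other than A, C, G or T, such as scanning errors, are reported as "X" rather than guessed.

```python
# Illustrative sketch only; names and the ORF start are assumptions
# inferred from the listing layout, not taken from the patent text.
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

ORF_START = 386  # assumed: first base of the ATG in the row "321 ... 394"


def translate(codon_row: str) -> str:
    """Translate a whitespace-separated codon row; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(c.upper(), "X") for c in codon_row.split())


def nt_start(aa_number: int) -> int:
    """Nucleotide coordinate of the first base of a residue's codon."""
    return ORF_START + 3 * (aa_number - 1)


def aa_index(nt_position: int) -> int:
    """Residue number whose codon contains the given nucleotide coordinate."""
    return (nt_position - ORF_START) // 3 + 1


if __name__ == "__main__":
    row_395 = ("ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC "
               "AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT")
    print(translate(row_395))           # residues 4 to 23 of the listing
    print(nt_start(4), nt_start(3984))  # row start coordinates
    print(translate("TOG GTC"))         # damaged codon is flagged, not guessed
```

Comparing the output of translate with the printed residue row is one way to locate codons damaged in scanning, since a mismatch or an "X" marks the affected position.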
321 cagcctgacagggcgccgcagaggcccactgtattgctaccaaaaatccctgctgtacatggcac ATG GAG TTG 394
1 MEL 3

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG 514
24 VYDQAGDPLPGERGAVHPQS 43

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA OGC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574
44 TLKLPHKRGERDVPTNLASL 63

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634
64 PKRGDCRSGNSRGPVSGIYL 83

635 AAG CCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCG CTG 694
84 KPGPLFYQDYKGPVYHRAPL 103

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CGG ATA GGG AGA GTA ACT GGA 754
104 ELFEEGSMCETTKRIGRVTG 123

755 AGT GAC GGA AAG CTG TAC CAC ATT TAT GTC TGT ATA GAT GGA TGT ATA ATA ATA AAA AGT 814
124 SDGKLYHIYVCIDGCIIIKS 143

815 GCC ACG AGA AGT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874
144 ATRSYQRVFRWVHNRLDCPL 163

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934
164 WVTTCSDTKEEGATKKKTQK 183

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994
184 PDRLERGKMKIVPKESEKDS 203

995 AAA ACT AAA CCT COG GAT GCT ACA ATA GTG GTTG GAA GGA GTC AAA TAC CAG GTG AGO AAG 1054
204 KTKPPDATIVVEGVKYQVRK 223

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAG GAC GGC TTG TAC CAT AAC AAA AAC AAA CCT 1114
224 KGKTKSKNTQDGLYHNKNKP 243

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GCT ATA GTT 1174
244 QESRKKLEKALLAWAIIAIV 263

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TOG AAC CTA CAA GAT AAT GGG ACG 1234
264 LFQVTHGENITQWNLQDNGT 283

1235 GAA GGG ATA CAA COG GCA ATG TTC CAA AGG GOT GTG AAT AGA AGT TTA CAT GGA ATC TGG 1294
284 EGIQRAMFQRGVNRSLHGIW 303

1295 OCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA OCC ACC GAT ATA GAA CTA AAA ACA 1354
304 PEKICTGVPSHLATDIELKT 323

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CAA CGC 1414
324 IHGMMDASEKTNYTCCRLQR 343

1415 CAT GAG TGG AAC AAG CAT GGT TGG TGC AAC TOG TAC AAT ATT GAA CCC TOG ATT CTA GTC 1474
344 HEWNKHGWCNWYNIEPWILV 363

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534
364 MNRTQANLTEGQPPRECAVT 383

1535 TGT AGG TAT GAT AGG GCT AGT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CCC ACA 1594
384 CRYDRASDLNVVTQARDSPT 403

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CGG GGC 1654
404 PLTGCKKGKNFSFAGILMRG 423

1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT AGT 1714
424 PCNFEIAASDVLFKEHERIS 443





1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GGT GCC 1774 
444 HFQDTTLYLVDGLTNSLEGA 463 
1775 AGA CAA GGA ACC GCT AAA CTG ACA ACC TGG TTA GGC AAG CAG CTC GGG ATA CTA GGA AAA 1834

464 RQGTAKLTTWLGKQLG I LGK 483 

1835 AAG TTG GAA AAC AAG ACT AAG AOG TGG TTT GGA GCA TAG GCT GCT TCC CCT TAG TGT GAT 1894

484 KLENKSKTWPGAYAASPYCD 503 



1895 GTC GAT OGC AAA ATT GGC TAG ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TGC TTA CCC 1954

504 VDRKIGYIWYTKNCTPACLP 523

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TIT GAC ACC AAT GCA GAG GAC GGC AAG ATA 2014

524 KNTKIVGPGKFDTNAEDGKI 543



2015 TTA CAT GAG ATG GGG GGT CAC TTG TCG GAG GTA CTA CTA CTT TCT TTA GTG GTG CTG TCC 2074 

544 LHBMGGHLSEVLLLSLVVLS 563 

2075 GAC TTC OCA CCG GAA ACA GCT AGT GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 DFAPETASVMYLI LH FSI PQ 583 

2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLT VELT 603 

2195 ACA GCT GAA GTA ATA CCA GGG TCG GTC TGG AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 TAEV1PGSVWNLGKYVCIRP 623 



2255 AAT TGG TGG CCT TAT GAG ACA ACT GTA GTG TTG GCA TTT GAA GAG GTG AGC CAG GTG GTG 2314

624 NWWPYETTVVLAPEEVSQVV 643

2315 AAG TTA GTG TTG AOG GCA CTC AGA GAT TTA ACA OGC ATT TOG AAC GCT GCA ACA ACT ACT 2374

644 KLVLRALRDLTRIWNAATTT 663



2375 OCT TTT TTA OTA TGC CTT GTT AAG ATA GTC AOG OOC CAG ATG GTA CAG GQC ATT CTG TOG 2434 
664 AFLVCLVKIVRGQMVQGILW 683 



2435 CTA CTA TTG ATA ACA GGG GTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TCG TAT GCC 2494

684 LLLITGVQGHLDCKPEFSYA 703



2495 ATA GCA AAG GAC GAA AGA ATT GGT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 
704 IAKDERIGQLGAEGLTTTWK 723 



2555 GAA TAC TCA CCT OGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614

724 EYSPGMKLEDTMVIAWCEDG 743



2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMYLQRCTRETRYLAILHT 763 

2675 AGA GCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CGA AAG CAA GAG GAT 2734 

764 RALPTSVVFKKLFDGRKQED 783 






2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794 

784 VVEMNONFEFGLCPCDAKPI 803 

2795 OTAAGAOGGAAOTTCAATACAAOGCTOCTGAACGGACCGGCCTTCCAGATG OTA TGC CCC 2854 

804 VRGKFNTTLLNGPAFQMVCP 823 



2855 ATA OGA TGG ACA 006 ACT GTA AGC TGT AGG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914

824 IGWTGTVSCTSFNMDTLATT 843



2915 CTTG OTA GOG ACA TAT AGA AOG TCT AAA CCA TTC OCT CAT AOG CAA OGC TOT ATC ACC CAA 2974 

844 VVRTYRRSKPFPHRQGCITQ 863

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 

864 KNLGEDLHNCILGGNWTCVP 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TGT OGC TAT CAA 3094 

884 GDQLLYKGGSIESCKWCGYQ 903

3095 TTT AAA GAO AGT GAG OGA CTA CCA CAC TAC CCC ATT GGC AAG TOT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHYPIGKCKLENE 923 

3155 ACT GGT TAC AGG CTA CTA GAC ACT ACC TCT TGC AAT AGA GAA GCT GTG GCC ATA CTA CCA 3214 

924 TGYRLVDSTSCNREGVAIVP 943 



3215 CAA GGG ACA TTA AAG TGC AAG ATA GGA AAA ACA ACT CTA CAG GTC ATA GCT ATG GAT ACC 3274

944 QGTLKCKIGKTTVQVIAMDT 963



3275 AAA CTC GGA CCT ATG CCT TGC AGA CCA TAT GAA ATC ATA TCA ACT GAG GGG CCT CTA GAA 3334 
964 KLGPHPCRPYEIISSEGPVE 983 



3335 AAG ACA OOG TCT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 

984 KTACTFNYTKTLKNK Y F E PR 1003 

3395 GAC AGC TAC TTT CAG CAA TAC ATG CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454 

1004 DSYFQOYMLKGEYQYWFDLE 1023 

3455 OTG ACT GAC CAT CAC CGG GAT TAC TTC GCT GAG TCC ATA TTA CTC GTG CTA CTA GCC CTC 3514 

1024 VTDHHRDYFAESILVVVVAL 1043 
3515 TTG GGT GGC AGA TAT GTA CTT TGG TTA CTG GTT ACA TAC ATG GTC TTA TCA GAA CAG AAG 3574

1044 LGGRYVLWLLVTYMVLSEQK 1063

3575 GCC TTA GOG ATT CAG TAT GGA TCA GGG GAA GTG GTG ATG ATG GGC AAC TIC CTA ACC CAT 3634 

1064 ALGIQYGSGEVVMMGNLLTK 1083 

3635 AAC AAT ATT GAA GTG GTG ACA TAC TTC TTG CTG CTG TAC CTA CTG CTG AGG GAG GAG AGC 3694 

1084 NNIEVVTYFLLLYLLLREES 1103 

3695 GTA AAG AAG TGG GTC TTA CTC TTA TAC CAC ATC TTA GTG GTA CAC CCA ATC AAA TCT GTA 3754 

1104 VKKWVLLLYHILVVHPIKSV 1123 

3755 ATT GTG ATC CTA CTG ATG ATT GGG GAT GTG GTA AAG GCC GAT TCA GGG GGC CAA GAG TAC 3814 

1124 IVILLMIGDVVKADSGGQEY 1143 

3815 TTG GGG AAA ATA GAC CTC TOT nr ACA ACA GTA GTA CTA ATC GTC ATA GOT TTA ATC ATA 3874 

1144 LGKIDLCFTTVVLIVIGLII 1163 

3875 GCC AGG CGT GAC CCA ACT ATA GTG CCA CTG GTA ACA ATA ATG GCA OCA CTG AGG GTC ACT 3934 

1164 ARRDPTIVPLVTIMAALRVT 1183 

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC OCT GTG GOG GTC ATG ACT ATA ACC CTA CTG 3994 

1184 ELTHQ PGVDIAVAVMTITLL 1203 

3995 ATG GTT AGC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TOG TTA CAG TGC ATT CTC AGC 4054 

1204 MVSYVTDYFRYKKWLQCILS 1223 

4055 CTC GTA TCT GCG GTG TTC TTG ATA AGA AGC CTA ATA TAC CTA GGT AGA ATC GAG ATG CCA 4114 

1224 LVSAVFLIRSLIYLGRI EHF 1243 

4115 GAG GTA ACT ATC CCA AAC TOG AGA CCA CTA ACT TTA ATA CTA TTA TAT TTG ATC TCA ACA 4174 

1244 EVTIPNWRPLTLILLYLIST 1263 

4175 ACA ATT GTA ACG AGG TOO AAG GTT GAC GTG GCT GGC CTA TTG TTG CAA TGT GTG CCT ATC 4234 

1264 TIVTRWKVDVAGLLLQCVPI 1283 

4235 TTA TTG CTG GTC ACA ACC TTG TGG GCC GAC TTC TTA ACC CTA ATA CTG ATC CTG CCT ACC 4294 

1284 LLLVTTLWADFLTLILILPT 1303 

4295 TAT GAA TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AGG ACT GAT ATA GAA AGA AGT TGG 4354 

1304 YELVKLYYLKTVRTDIERSW 1323 

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG AGT GGA GAG 4414

1324 LGGIDYTRVDSIYDVDESGE 1343



4415 GGC GTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GOG AAT TTT TCT ATA CTC TTG CCC 4474 

1344 GVYLFPSRQKAQGNFSI LLP 1363 

4475 CTT ATC AAA GCA ACA CTG ATA AGT TGC GTC AGC AGT AAA TOG CAG CTA ATA TAC ATG AGT 4534 
1364 LIKATLISCVSSKWQLIYMS 1383

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AGO AAA CTT ATA GAA GAG ATC TCA GGA 4594 

1384 YLTLDFMYYMHRKVIEEISG 1403 

4595 GGT ACC AAC ATA ATA TCC AGG TTA GTG GCA GCA CTC ATA GAG CTG AAC TGG TCC ATG GAA 4654 

1404 GTNI Z SRLVAALIELNWSME 1423 

4655 GAA GAG GAG AGC AAA GGC TTA AAG AAG TTT TAT CTA TPG TCT GGA AGG TTG AGA AAC CTA 4714 

1424 EEESKGLKKFYLLSGRLRNL 1443 

4715 ATA ATA AAA CAT AAG CTA AGG AAT GAG ACC GTG GCT TCT TGG TAC GGG GAG GAG GAA GTC 4774 

1444 I IKHKVRNETVASWYGEEEV 1463 

4775 TAC OCT ATG CCA AAG ATC ATG ACT ATA ATC AAG GCC ACT ACA CTG ACT AAG AGC AGG CAC 4834 

1464 YGMPKIMTI IKASTLSKSRH 1483 

4835 TOC ATA ATA TGC ACT CTA TCT GAG GGC OGA GAG TOG AAA GOT GGC ACC TGC CCA AAA TCT 4894 

1484 CI ICTVCEGREWKGGTCPKC 1503 

4895 GGA OGC CAT GGG AAG CCG ATA ACG TCT GGG ATG TCG CTA GCA GAT TTT GAA GAA AGA CAC 4954 

1504 GRHGKPITCGMSLADFEERH 1523 

4955 TAT AAA AGA ATC TTT ATA AGG GAA GGC AAC TTT GAG gggccc TTC AGG CAG GAA TAC AAT 5014 

1524 YKRIFIREGNFE FRQEYN 1541 

5015 GGC TTT CTA CAA TAT ACC GCT AGG GGG CAA CTA TTT CTG AGA AAC TTG CCC CTA CTG GCA 5074 

1542 GFVQYTARGQLFLRNLPVLA 1561 

5075 ACT AAA OTA AAA ATG CTC ATG CTA GGC AAC CTT GGA GAA GAA ATT GCT AAT CTG GAA CAT 5134 

1562 TKVKHLMVGNLGEE IGNLEH 1581 

5135 CTT GGG TGG ATC CTA AGG GGG CCT GCC GTG TGT AAG AAG ATC ACA GAG CAC GAA AAA TGC 5194

1582 LGWILRGPAVCKKITEHEKC 1601 

5195 CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA TTT TTC GGG ATC ATC CCA AGG GGG ACT ACA 5254 

1602 HINI LDKLTAFFGIMPRGTT 1621 



5255 CCC AGA GCC CCG GTG AGG TTC CCT ACG AGC TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT 5314

1622 PRAPVRFPTSLLKVRRGLET 1641 

5315 GCC TCG GCT TAC ACA CAC CAA GCX: GOG ATA ACT TCA GTC GAC CAT GTA ACC OCC OGA AAA 5374 

1642 AWAYTKQGGISSVOHVTAGK 1661 

5375 GAT CTA CTG GTC TGT GAC AGC ATG GGA OGA ACT AGA GTG GTT TGC CAA AGC AAC AAC AGG 5434 

1662 DLLVCDSMGRTRVVCQSNNR 1681 

5435 TTG ACC GAT GAG AGA GAG TAT GGC GTC AAG ACT GAC TCA OGG TGC CCA GAC GGT GCC AGA 5494 

1682 LTDETEYGVKTDSGC PDGAR 1701 

5495 TCT TAT GTG TTA AAT CCA GAG GCC GTT AAC ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC 5554 

1702 CYVLNPEAVNISGSKGAVVH 1721 

5555 CTC CAA AAG ACA GGT GGA GAA TTC ACG TGT GTC ACC GCA TCA GGC ACA CCG GCT TTC TTC 5614 

1722 LQKTGGE.FTCVTASGTPAF F 1741 

5615 GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG 5674 

1742 DLKNLKGWSGLPIFEASSGR 1761 

5675 GTC GTT GGC AGA GTC AAA GTA GGG AAG AAT GAA GAG TCT AAA CCT ACA AAA ATA ATG AGT 5734 

1762 VVGRVKVGKNEESKPTKIMS 1781 

5735 GGA MC CAG ACC GTC TCA AAA AAC AGA GCA GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC 5794 

1782 QXQTVSKNRADLTEHVKKIT 1801 

5795 AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA 5854 

1802 SMNRGDFKQITLATGAGKTT 1821 

5855 GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA 5914 

1822 ELPKAVIEEIGRHKRVLVLI 1841 

5915 CCA TTA AGG GCA GCG GCA GAG TCA GTC TAC CAG TAT ATG AGA TTG AAA CAC CCA AGC ATC 5974 

1842 PLRAAAESVYQYMRLKHPSI 1861 

5975 TXTT TTT AAC CTA AGG ATA GGG GAC ATG AAA GAG GGG GAC ATG GCA ACC GOG ATA ACC TAT 6034 

1862 SFNLRIGDMKEGDMATGITY 1881 

6035 OCA TCA TAC GGG TAC TTC TGC CAA ATG CCT CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA 6094 

1882 ASYGYFCQMPQPKLRAAMVE 1901 

6095 TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT TGT GCC ACT CCT GAA CAA CTG GCA ATT ATC 6154

1902 YSYIFLDEYHCATPEQLAII 1921

6155 GGG AAG ATC CAC AGA TTT TCA GAG AGT ATA AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA 6214

1922 GKIHRFSESIRVVAMTATPA 1941

6215 GGG TCG GTG ACC ACA ACA GGT CAA AAG CAC CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA 6274

1942 GSVTTTGQKHPIEEFIAPEV 1961

6275 ATG AAA GGG GAG GAT CTT GGT AGT CAG TTC CTT GAT ATA GCA GGG TTA AAA ATA CCA GTG 6334

1962 MKGEDLGSQFLDIAGLKIPV 1981

6335 GAT GAG ATG AAA GGC AAT ATG TTG GTT TTT GTA CCA ACG AGA AAC ATG GCA GTA GAG GTA 6394 

1982 DEMKGNMLVFVPTRNMAVEV 2001 

6395 OCA AAG AAG CTA AAA GCT AAO GGC TAT AAC TCT GGA TAC TAT TAC AGT OGA GAG GAT CCA 6454 

2002 AKKLKAKOYNSGYYYSOEDP 2021 

6455 GCC AAT CTG AGA GTT GTG ACA TCA CAA TCC CCC TAT GTA ATC GTG GCT ACA AAT GCT ATT 6514 

2022 ANLRVVTSQSPYVIVATNAI 2041 

6515 GAA TCA GGA GTG ACA CTA CCA GAT TTG GAC ACG GTT ATA GAC ACG GGG TTG AAA TGT GAA 6574 

2042 ESGVTLPDLDTVIDTGLKCE 2051 

6575 AAG AGG GTG A06 CTA TCA TCA AAG ATA CCC TTC ATC GTA ACA GGC CTT AAO AGO ATG GCC 6634 

2062 KRVRVSSKI PFIVTGLKRMA 2081 

6635 GTG ACT GTG GGT GAG CAG GCG CAG CGT AGG GGC AGA GTA GGT AGA GTG AAA CCC GGG AGG 6694

2082 VTVGEQAQRRGRVGRVKPGR 2101 

6695 TAT TAT AGG ACC CAG GAA ACA GCA ACA GGG TCA AAG GAC TAC CAC TAT GAC CTC TTG CAG 6754 

2102 YYRSQETATGSKDYHYDLLQ 2121 

6755 GCA CAA AGA TAC GGG ATT GAG GAT GGA ATC AAC GTG AOG AAA TCC TTT AGG GAG ATG AAT 6814 

2122 AQRYGI EDGINVTKSFREMN 2141 

6815 TAC GAT TGG AGC CTA TAC GAG GAG GAC AGC CTA CTA ATA ACC CAG CTG GAA ATA CTA AAT 6874 

2142 YDWSLYEEDSLLITQLEILN 2161 

6875 AAT CTA CTC ATC TCA GAA GAC TTG CCA GCC OCT GTT AAG AAC ATA ATG GCC AGG ACT GAT 6934 

2162 NLLISEDLPAAVKNIMARTD 2181 

6935 CAC CCA GAG CCA ATC CAA CTT GCA TAC AAC AGC TAT GAA GTC CAG GTC CCG GTC CTG TTC 6994 

2182 HPEPIQLAYNSYEVOVPVLF 2201 
6995 CCA AAA ATA AGG AAT GGA GAA GTC ACA GAC ACC TAC GAA AAT TAC TCG TTT CTA AAT GCC 7054
2202 PKIRNGEVTDTYENYSFLNA 2221 

7055 AGA AAG TTA OGG GAG GAT GTG CCC GTG TAT ATC TAC GCT ACT GAA GAT GAG GAT CTG OCA 7114 
2222 RKLGEDVPVYIYATEDEDLA 2241 

7115 GTT GAG CTC TTA OGG CTA GAC TOG CCT GAT CCT GGG AAC CAG CAG GTA CTG GAG ACT GGT 7174 
2242 VDLLGLDWPDPGNOOVVETG 2261 

7175 AAA GCA CTG AAG CAA GTG ACC GGG TTG TCC TCG GCT GAA AAT GCC CTA CTA GTG GCT TTA 7234 
2262 KALKQVTGLSSAENALLVAL 2281 

7235 TTT GGG TAT GTG GGT TAC CAG GCT CTC TCA AAG AGG CAT GTC CCA ATG ATA ACA GAC ATA 7294 
2282 FGYVGYQALSKRHVPMI TDI 2301 

7295 TAT ACC ATC GAG GAC CAG AGA CTA GAA GAC ACC ACC CAC CTC CAG TAT GCA CCC AAC GCC 7354 
2302 YTIEDQRLEDTTHLQYAPNA 2321 

7355 ATA AAA ACC GAT GGG ACA GAG ACT GAA CTG AAA GAA CTG GCG TCG GGT GAC GTC GAA AAA 7414 
2322 IKTDGTETELKELASGDVEK 2341 

7415 ATC ATG GGA GCC ATT TCA GAT TAT GCA GCT GGG GGA CTG GAG TTT GTT AAA TCC CAA GCA 7474 
2342 IMGAISOYAAGGLEFVKSQA 2361 

7475 GAA AAG ATA AAA ACA GCT CCT TTG TTT AAA GAA AAC GCA GAA GCC GCA AAA GGG TAT GTC 7534 
2362 EKIKTAPLFKENAEAAKGYV 2381 

7535 CAA AAA TTC ATT GAC TCA TTA ATT GAA AAT AAA GAA GAA ATA ATC AGA TAT OGT TTG TGG 7594 
2382 QKFIDSLIENKEEI ZRYGLW 2401 

7595 GGA ACA CAC ACA GCA CTA TAC AAA AGC ATA GOT OCA AGA CTG GGG CAT GAA ACA OCG TTT 7654 
2402 OTHTALYKSIAARLGHETAF 2421 

7655 GCC ACA CTA GTG TTA AAG TGG CTA GCT TIT GGA GGG GAA TCA GTG TCA GAC CAC GTC AAG 7714 
2422 ATLVLKWLAFGGESVSDHVK 2441 

7715 CAG GCG GCA GTT GAT TTA GTG GTC TAT TAT GTG ATG AAT AAG CCT TCC TTC CCA GGT GAC 7774 
2442 QAAVDLVVYYVMNKPSF PGD 2461 

7775 TCC GAG ACA CAG CAA GAA GGG AGG CGA TTC GTC GCA AGC CTG TTC ATC TCC GCA CTG GCA 7834
2462 SETQQEGRRFVASLFISALA 2481 

7835 ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC AAT CTC TCT AAA GTG GTG GAA CCA GCC CTG 7894
2482 TYTYKTWNYHNLSKVVEPAL 2501

7895 GCT TAC CTC CCC TAT GCT ACC AGC GCA TTA AAA ATG TTC ACC CCA ACG CGG CTG GAG AGC 7954
2502 AYLPYATSALKMFTPTRLES 2521

7955 GTG GTG ATA CTG AGC ACC ACG ATA TAT AAA ACA TAC CTC TCT ATA AGG AAG GGG AAG AGT 8014
2522 VVILSTTIYKTYLSIRKGKS 2541

8015 GAT GGA TTG CTG GGT ACG GGG ATA AGT GCA GCC ATG GAA ATC CTG TCA CAA AAC CCA GTA 8074
2542 DGLLGTGISAAMEILSQNPV 2561

8075 TOG GTA GGT ATA TCT GTG ATG TTG GGG GTA GGG GCA ATC GCT GOG CAC AAC GCT ATT GAG 8134 
2562 SVGISVMLGVGAIAAHNAI E 2581 

8135 TCC AGT GAA CAG AAA AOG ACC CTA CTT ATG AAG QTQ TTT GTA AAG AAC TTC TTG GAT CAG 8194 
2582 SSEQKRTLLMKVFVKNFLDQ 2601 

8195 GCT GCA ACA GAT GAG CTG GTA AAA GAA AAC CCA GAA AAA ATT ATA ATC GCC TTA TTT GAA 8254 
2602 AATDELVKENPEKI IMALFE 2621 

8255 GCA GTC CAO ACA ATT OOT AAC CCC CTC AGA CTA ATA TAC CAC CTC TAT OGG GTT TAC TAC 8314 
2622 AVQTIGNPLRLIYHLYGVYY 2641 

8315 AAA GGT TGG GAG GCC AAG GAA CTA TCT GAG AGG ACA GCA GOC AGA AAC TTA TTC ACA TTC 8374 
2642 KGWEAKELSERTAGRNLFTL 2661 

8375 ATA ATC TTT GAA GCC TTC GAG TTA TTA GGG ATC GAC TCA CAA GGG AAA ATA AGG AAC CTC 8434 
2662 IMFEAFELLGMDSOGKI RNL 2681 

8435 TCC GGA AAT TAC ATT TTC GAT TTG ATA TAC GGC CTA CAC AAG CAA ATC AAC AGA GGG CTC 8494 
2682 SGNYILDLZYGLHKQINRGL 2701 

8495 AAG AAA ATC GTA CTG GGG TGG GCC CCT GCA CCC TTT AGT TOT GAC TGG ACC CCT AGT GAC 8554 
2702 KKMVLGWAPAPFSCDWTPSD 2721 

6555 GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT TTG AOG GTA GAA ACC AGG TGC CCA TOT GGC 8614 
2722 ERIRLPTDNYLRVETRC PCG 2741 

8615 TAT GAQ ATG AAA OCT TIC AAA AAT OTA GOT GGC AAA CTT ACC AAA GTG GAG GAG AGC GGG 8674 
2742 YEMKAFKNVOOKLTKVEESG 2761 

8675 CCT TTC CTA TGT AGA AAC AGA CCT GCT AOG OGA CCA GTC AAC TAC AGA CTTC ACC AAG TAT 8734 
2762 PFLCRNRPGRGPVNYRVTKY 2781 
8735 TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA GTA GCA AAG TTG GAA GGA CAG GTA GAG CAC 8794

2782 YDDNLREIKPVAKLEGQVEH 2801

8795 TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC TAC ACT AAA GGA AAA ATG CTC TTG GCC ACT 8854 

2802 YYKGVTAKIDYSKGKMLLAT 2821 

8855 GAC AAG TGG GAG GTG GAA CAT GGT GTC ATA ACC AQG TTA GCT AAG AGA TAT ACT GGG GTC 8914 

2822 DKWEVEHGVITRLAKRVTGV 2841 

8915 GGG TTC AAT GGT GCA TAC TTA GGT GAC GAG CCC AAT CAC CGT GCT CTA GTG GAG AGG GAC 8974 

2842 GFNGAYLGDEPNHRALVERD 2861 

8975 TGT GCA ACT ATA ACC AAA AAC ACA GTA CAG TTT CTA AAA ATG AAG AAG GGG TGT QCG TTC 9034 

2862 CATI TKNTVQFLKMKKGCAF 2881 

9035 ACC TAT GAC CTG ACC ATC TCC AAT CTG ACC AGG CTC ATC GAA CTA GTA CAC AGG AAC AAT 9094 

2882 TYDLTISNLTRLIELVHRNN 2901 

9095 CTT GAA GAG AAG GAA ATA CCC ACC GCT ACG GTC ACC ACA TGG CTA GCT TAC ACC TTC GTG 9154 

2902 LEEKEI PTATVTTWLAYTFV 2921 

9155 AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA 9214 

2922 NEDVGTIKPVLGERVI PDPV 2941 

9215 GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA GTG GAC ACG TCA GAG GTT GGG ATC ACA ATA 9274 

2942 VDINLQPEVQVOTSEVGITI 2961 

9275 ATT GGA AGG GAA ACC CTG ATG ACA ACG GGA GTG ACA CCT GTC TTG GAA AAA GTA GAG CCT 9334

2962 IGRETLMTTGVTPVLEKVEP 2981 

9335 GAC GCC AGC GAC AAC CAA AAC TCG GTG AAG ATC GGG TTG GAT GAG GGT AAT TAC CCA GGG 9394

2982 DASDNQNSVKIGLDEGNYPG 3001 

9395 CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA GAA ATA CAC AAC AGG GAT GCG AGG CCC TTC 9454 

3002 PGIQTHTLTEEIHNRDARPF 3021 

9455 ATC ATG ATC CTG GGC TCA AQG AAT TCC ATA TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA 9514 

3022 IMILGSRNSISNRAKTARNI 3041 

9515 AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA ATA CGA GAC TTG ATG GCT GCA GGG CGC ATG 9574

3042 NLYTGNDPREIRDLMAAGRM 3061



9575 TTA GTA GTA GCA CTG AGG GAT GTC GAC CCT GAG CTG TCT GAA ATG GTC GAT TTC AAG OGG 9634 

3062 LVVALRDVDPELSEMVDFKG 3081 

9635 ACT TTT TTA GAT AGG GAG GCC CTG GAG GCT CTA AGT CTC GGG CAA CCT AAA CCG AAG CAG 9694 

3082 TFLDREALEALSLGOPKPKQ 3101 

9695 GTT ACC AAG GAA GCT GTT AQG AAT TTG ATA GAA CAG AAA AAA GAT GTG GAG ATC CCT AAC 9754 

3102 VTKEAVRNLZEQKKDVEI PN 3121 

9755 TOG TTT GCA TCA GAT GAC CCA GTA TTT CTG GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC 9814 

3122 WFASDDPVFLEVALKNDKYY 3141 

9815 TTA GTA GGA GAT GTT GGA GAG CTA AAA GAT CAA GCT AAA GCA CTT GGG GCC ACG GAT CAG 9874 

3142 LVGDVGELKDQAKALGATDQ 3161 

9875 ACA AGA ATT ATA AAG GAG GTA GGC TCA AGG ACG TAT GCC ATG AAG CTA TCT AGC TGG TTC 9934 

3162 TRIIKEVQSRTYAMKLSSWF 3181 

9935 CTC AAG GCA TCA AAC AAA CAG ATG ACT TTA ACT CCA CTG TTT GAG GAA TTG TTG CTA CGG 9994 

3182 LKASNKQMSLTPLFEELLLR 3201 

9995 TQC CCA CCT GCA ACT AAG AGC AAT AAG GGG CAC ATG GCA TCA GCT TAC CAA TTG GCA CAG 10054 

3202 CPPATKSNKGHMASAYQLAQ 3221 

10055 GGT AAC TGG GAG CCC CTC GCT TGC GGG GTG CAC CTA GCT ACA ATA CCA GCC AGA AGG GTG 10114 

3222 GNWEPLGCGVHLGTI PARRV 3241 

10115 AAG ATA CAC CCA TAT GAA GCT TAC CTG AAG TTG AAA GAT TTC ATA GAA GAA GAA GAG AAG 10174 

3242 KIHPYEAYLKLKDFI EEEEK 3261 

10175 AAA CCT AGG (TTT AAG GAT ACA CTA ATA AGA GAG CAC AAC AAA TGG ATA CTT AAA AAA ATA 10234 

3262 KPRVKDTVIREHNKWI LKKI 3281 

10235 AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA ATG CTC AAC CCG GGG AAA CTA TCT GAA CAG 10294 

3282 RFQGNLNTKKMLNPGKLSEQ 3301 

10295 TTG GAC AGG GAG GGG CGC AAG AGG AAC ATC TAC AAC CAC CAG ATT OCT ACT ATA ATG TCA 10354 

3302 LDRBGRKRKIYNHQIGTIMS 3321 

10355 AGT GCA GGC ATA AOG CTG GAO AAA TTG CCA ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC 10414 

3322 SAGIRLEKLPIVRAQTDTKT 3341 

10415 TTT CAT GAG GCA ATA AGA GAT AAG ATA GAC AAG ACT GAA AAC CGG CAA AAT CCA GAA TTG 10474 
3342 FHSAIRDKIDKSENRQNPEL 3361 
10475 CAC AAC AAA TTG TTG GAG ATT TTC CAC ACQ ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC 10534

3362 HNKLLEIFHTIAQPTLKHTY 3381 

10535 GGT GAG GTG ACG TGG GAG CAA CTT GAG GCG GGG ATA AAT AGA AAG GGG GCA GCA GGC TTC 10594

3382 GEVTWEQLEAGINRKGAAGF 3401

10595 CTG GAG AAG AAG AAC ATC GGA GAA GTA TTG GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG 10654 

3402 LEKKNIGEVLDSEKHLVEQL 3421 

10655 GTC AGG GAT CTG AAG GCC GGG AGA AAG ATA AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT 10714 

3422 VRDLKAGRKIKYYETAI PKN 3441 

10715 GAG AAG AGA GAT GTC AGT GAT GAC TGG CAG GCA GGG GAC CTG GTG GTT GAG AAG AGG CCA 10774 

3442 EKRDVSDDWQAGDLVVEKRP 3461 

10775 AGA C7IT ATC CAA TAC CCT GAA GCC AAG ACA AGG CTA GCC ATC ACT AAG GTC ATG TAT AAC 10834 

3462 RVIQYPEAKTRLAITKVMYN 3481 

10835 TGG GTG AAA CAG CAG CCC GTT GTG ATT CCA GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC 10894 

3482 WVKQQPVVIPGYEGKTPLFN 3501 

10895 ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC TCG TTC AAT GAG CCA GTG GCC GTA AGT TTT 10954 

3502 IFDKVRKEWDSFNEPVAVSF 3521 

10955 GAC ACC AAA GCC TGG GAC ACT CAA GTG ACT AGT AAG GAT CTG CAA CTT ATT OGA GAA ATC 11014 

3522 DTKAWDTQVTSKDLQLIGEI 3541 

11015 CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG 11074 

3542 QKYYYKKEWHKFIDTITDHM 3561 

11075 ACA GAA GTA CCA GTT ATA ACA GCA GAT GGT GAA GTA TAT ATA AGA AAT OGQ CAG AGA GGG 11134 

3562 TEVPVITADGEVYIRNGQRG 3581 

11135 AGC GGC CAG CCA GAC ACA AGT GCT OGC AAC AGC ATG TTA AAT GTC CTG ACA ATG ATG TAC 11194 

3582 SGQPDTSAGNSMLNVLTMMY 3601 

11195 GGC rrC TGC GAA AGC ACA GGG GTA CCG TAC AAG AGT TTC AAC AGG GTG GCA AGG ATC CAC 11254 

3602 CFCESTGVPYKSFNRVARIH 3621 

11255 GTC TGT GGG GAT GAT OGC TTC TTA ATA ACT GAA AAA GOG TTA OGG CTG AAA TTT OCT AAC 11314 

3622 VCGDDGFLITEKGLGLKFAN 3641 

11315 AAA GGG ATG CAG ATT CTT CAT GAA GCA GGC AAA CCT CAG AAG ATA ACG GAA GGG GAA AAG 11374 

3642 KGMQI LHEAGKPQKITEGEK 3661 

11375 ATG AAA GTT GCC TAT AGA TTT GAG GAT ATA GAG TTC TGT TCT CAT ACC CCA GTC CCT GTT 11434 

3662 MKVAYRPEDIEFCSHTPVPV 3681 

11435 AGG TGG TCC GAC AAC ACC AGT AGT CAC ATG GCC GGG AGA GAC ACC GCT GTG ATA CTA TCA 11494 

3682 RWSONTSSHHAGRDTAVI LS 3701 



11495 AAG ATG GCA ACA AGA TTG GAT TCA AGT GGA GAG AGG GGT ACC ACA GCA TAT GAA AAA GCG 11554
3702 KMATRLDSSGERGTTAYEKA 3721



11555 GTA GCC TTC AGT TTC TTG CTG ATG TAT TCC TOG AAC CCG CTT GTT AOG AOQ ATT TGC CTG 11614 

3722 VAFSFLLMYSWNPLVRRI CL 3741 

11615 TTG GTC CTT TCG CAA CAG CCA GAG ACA GAC CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA 11674 

3742 LVLSQQPETDPSKHATYYYK 3761 

11675 GGT GAT CCA ATA GGG GCC TAT AAA GAT GTA ATA GGT CGG AAT CTA AGT GAA CTG AAG AGA 11734 

3762 GDP IGAYKDVIGRNLS ELKR 3781 

11735 ACA OGC TTT GAG AAA TTG GCA AAT CTA AAC CTA AGC CTG TCC ACC TTG GGG ATC TGG ACT 11794 

3782 TGF EKLANLNLSLSTLGIWT 3801 

11795 AAG CAC ACA AGC AAA AGA ATA ATT CAG GAC TOT GTT GCC ATT GGG AAA GAA GAG GGC AAC 11854 

3802 KHTSKRI IQDCVAIGKEEGN 3821 

11855 TGG CTA CTTT AAC GCC GAC AGG CTG ATA TCC AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT 11914 

3822 WLVNADRLISSKTGHLYI PD 3841 



11915 AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC 11974
3842 KGFTLQGKHYEQLQLRTETN 3861



11975 CCG GTC ATG OOG GTT GGG ACT GAG AGA TAC AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG 12034 
3862 PVMGVGTERYKLGPIVNLLL 3881 

12035 AGA AGG TTG AAA ATT CTG CTC ATG ACG GCC GTC GGC GTC AGC AGC TGA gacaaaacgCACataC 12098
3882 RRLKILLMTAVGVSS* 3897 

12099 tgcaaatannttaatccatgtacatagtgcatataaatatagttgggaccgcccaectcaagaagacgacacgcccaaca 1 2178 

12 179 cgcacagc caaacagcagccaagac cacc cacc ccaagacaacaccaca t ccaatgcacacagcacc t cagccgcacgag 12258 

12259 gacacgcccgacgtccacagtcggaccagggaagacctccaacagccccc 12308 
GTATaatcactcccclgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgagtgtcgt^^ 
gaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgaccgggtcctttcttggata 
aacccgctcaatgcctggagamgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaaaggccngtggt 
ctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 13 
GTaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtcmgccatggcgtiagtatgagtgtcgtgcagcctc^ 
cccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgaccgggtcctttcttggata^ 
ccgctcaatgcctggagatttgggcgtgccxccgcaagactgciagccgagtagtgttgggtcgcgaaaggccttgtggtac^^^ 
atagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 14 
GTATacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgtmgtatgap^ 
tgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgaccggg 
icctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgK^ 
cttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgigcaccATG 



FIGURE 15 
GTATCAGAAGTGCGAATGCTGAacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaa 

gcgtctagccatggcgttagtatgagtgtcgtgcagcctccaggaccccxcctoccgggagagccataglggtctgcggaacc 

agtacaccggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatligggcgtgc^ 

ctagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggte 

caccATG 



FIGURE 16 
GTATgccagccccctgatgggggcgacaciccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctag 

ccatggcg^tagtatgagtgtcgtgcagcctccaggaccccccxtcccgggagagccatagtggtctgcggaaccggtg 

ggaangccaggacgaccgggtccmcttggauaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgcta^ 

gtagtgngggtcgcgaaaggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccg^^^ 

G 



FIGURE 17 
GTATTGCAGTITgccagccccctgatgggggcgacactccaccatgaaicactcccctgtgaggaactactgtcttcacgc 

agaaagcgtctagccatggcgttagjatgagtgtcgtgcagtxtccaggacccccccicccgggagagccamgtg^^^ 

cggtgagtacaccggaaagccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagamgggcgtgcccccgc^ 

gactgciagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggigctigcgagtgccccgggaggic^^ 

ccgtgcaccATG 



FIGURE 18 
GAAGCCAGGGC 
CCCCGCTGGAO 
KSGAGAGTAAa 
jGATGTATAATy 
K3TCCATAATAG 
f^GAAGAGGGAG 
jGGAAAATGAA 

tccggatgcta 
3gaaaaaccaa 

tacatggaatctggccagag/Saat^g^^^ 



lAGAGAG 
TGACTGC 
CCAGGGC 
jCTGGAGC 
AGTAACT 
JTATAATA 
ATAATAO 
3AGGGAG 
^TGAA 
SATGCTA 
UACCAA 

:aggaat 

rATAGTT 
CAAGACH3ATGTAGTCGAAATGAACGACAACTITGAATITGGACTCTGCCCATGT 

GATGCCAAACCCATAGTAAGAGGGAAGTTCAATACAACGCTGCTG/VACGGACC 

GGCCTTCCAGATGGTATGCCCCATAGGATGGACAGGGACTGTAAGCTGTACGT 

CATTCAATATGGACACCTTAGCCACAACTGTGGTACGGACATATAGAAGGTCTA 

AACCATTCCCTCATAGGCAAGGCTGTATCACCCAAAAGAATCTGGGGGAGGAT 

CTCCATAACTGCATCCTTGGAGGAAATTGGACTTGTGTGCCTGGAGACCAAC^^ 

CTATACAAAGGGGGCrCTATTGAATCITGCAAGTGGTGTGGCTATCAATrrAAA 

GAGAGTGAGGGACTACCACACTACCCCL^TTGGCAAGTGTAAATrGGAGAACGA 

GACTGGTTACAGGCTAGTAGACAGTACCTCTTGCAATAGAGAAGGTGTGGCCA 

TAGTACCACAAGGGACATrAAAGTGCAAGATAGGAAAAACAACTGTACAGGTC 

ATAGCTATGGATACCAAACTCGGACCTATGCCrTGCAGACCATATG 

TCAAGTGAGGGGCC TGTAG AAAAGACAGCGTGTACTTTCAACTACACTAAGAC 

ATTAAAAAATAAGTATTITGAGCCCAGAGACAGCTACITrCAGC^ 

AAAAGGAGAGTATCAATACTGGTTTGACCTGGAGGTGACTGACCATCACCXKjG 

ATTAC TTCG CTGAGTCCATATTAGTGGTGGTAGTAGCCCTCTTGGGTGGCAGAT 

ATGTACirrGGTTACTGGTTACATACATGGTCTrATCAGAACAGAAGGCOT 

GGATTCAGTATGGATCAGGGGAAGTGGTGATGATGGGCAACTTGCTAACCCAT 

AACAATATTGAAGTGGTGACATACTTCITGCTGCTGTACCTACT 

GAGAGCGTAAAGAAGTGGGTCITACTCTrATACCACATCTrAGTGGTACAC^^ 

ATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGTGGTAAA 

TCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTTTTACA^ 

ACTAATCXjTCATAGGTITAATCATAGCCAGGCGTGACCCAACTATAGTGCCACT 

GGTAACAATAATGGCAGCACTGAGGGTCACIX3AACTGACCCACCAGCCT 

TTGACATCGCTGTGGCGGTCATGACTATAACCCTACTGATGGTTAGCrATGTGA 

CAGATTATTlTAGATATAAAAAATGGTTACAGTGCATIXJrCAGCCT 

GGTGrrCITGATAAGAAGCCTAATATA CCrA GGTAGAATCGAGATGCCAGAGG 

TAACTATCCCAAACTGGAGACCACTAACmAATACTATTATATITGATC^^ 

AACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTATTGTTGCAATGT^ 

TGCCTATCTTATTGCraGTCACAACCTTCT 

GATCCTGCCTACCTATGAATIXXnTAAATTATACTATCTGAA^ 

GATATAGAAAGAAGTrGGCTAGGGGGGATAGACTATACAAGAGTTGACTCCAT 

CTACGACGTTGAT GAGAG TGGAGAGGGCGTATATCTTTTTCCATCAAGGCAGA 

AAGCACAGGGGAATTTTTCTATACTCTTGCCCCI^ 

GTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTrACTTAACmGGACT 

TTATGTACrACATGCACAGGAAAGTTATAGAAGAGATCrCAGGAGGTACCAACA 

TAATATCCAGGTTAGTGGCAGCAC TCATA GAGCTGAACTGGTCCATGGAAGAA 

GAGGAGAGCAAAGGCirAAAGAAGTTTTATCrATTCTCrGGAAGGT^ 

CCTAATAATAAAACATAAGGTAAGGAATGAGACCXn'GGCITCTrGGT^ 

AGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATCAAGGrc 

CTX3AGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGAGGGCCGAGAG 

GAAAGGTGGCACXJTGC CCAAA ATGTGGACGCCATGGGAAGCCGATAACGTGT 

GGGATGTOjCTAGCAGATT^ 

GAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAAAGCATAGGAGGT 

TTGAAATGGACCGGGAACCTAAG AGTGC CAGATACTGTGCTGAGTGTAATAGG 

CTGCATCCTGCT GAGG AAGGTGACnTTGGGCAGAGTCGAGCATGTTGGGCCT 

CAAAATCACCTACmGCGCTGATGGATGGAAAGGTGTATGATATCACAGAGTG 

GGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACAGAGTCCCITGTC 

ACATGTCATTTGGTTCACGGATGCCTITCAGGCAGGAATACAATGGCm 

AATATACCGCTAGGGGGCAACTATITCTGAGAAACTrGCCCGTACTGG^ 

AAGTAAAAATGCTCATGGTAGGCAACCTIX3GAGAAGAAATTGGTAAT 

CATCTTGGGTGGATCCTAAGGGGGCCTGCCX3TGTGTAAGAAGATCA 

CGAAAAATGCCACATrAATATACrGGATAAACTAACXXK:ATrmC 

GCCAAGGGGGACTACA(XCAGAGCCCCGGTGAGGTTCCCTACGAGCTTACT 

AAGTGAGGAGGGGTCTGGAGACnXjCCTGGGCTrACACACACCAAGGCGGGAT 
AAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGTCTGTGACAGCA 

TGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGACCGATGAGACA 

GAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGCCAGATGTTATGT 

GTTAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGGCAGTCGTTCACC 

TCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCAGGCACACCGGCT 

TTCTTCGACCTAAAAAACITGAAAGGATGGTCAGGCTTGCCTATATTTGAAGCC 

TCCAGCGGGAGGGTGG1TGGCAGAGTCAAAGTAGGGAAGAATGAAGAGTCTA 

AACCTACAAAAATAATGAGTGGAATCCAGACCGTCTCAAAAAACAGAGCAGAC 

CTGAC XX}AG ATGGTQ\AGAAGATAA(X:AGCATGAACAGGGGAGACrTCAAGCA 

GATTACTTTGGCAACAQGGGCAGGCAAAACCACAGAACTCCCAAAAGCAGTTA 

TAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATACCATTAAGGGCA 

GCGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCCAAGCATCTCrnT 

AACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAACCGGGATAACCT 

ATGCATCATACGGGTAOTCTGCCAAATGCCTCAACCAAAGCrCAGAGCTGCTA 

TGGTAGAATACTCATACATATTCTTAGATGAATACCATTGTGCCACrCCrGAACA 

ACTGGCAATTATCGGGAAGATCCACAGATnTCAGAGAGTATAAGGGTTGTCG 

CXZATGACTGCCACGCCAGCAGGGTCXKJTGACCACAACAGarCAAAAGCACCCA 

ATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGGATCTTGGTAGTCA 

GTrccrrGATATAGCAGGGTTAAAAATACCAGTGGATGAQATGAAAGGCAATAT 

GTTGGTriTIXn'ACX:AAa5AGAAACATXXXL\C^^ 

aagctaagggcrataacrcrggatacrattacagtggagaggatccagccaat 

ctgagagttgtgacatcacaatccccctatgtaatcgtggcracaaatgcratr 

gaatcaggagtgacactaccagantggacacggttatagacacggggttgaa 

atgtgaaaagagggtgagggtatcatcaaagatacccttcatcgtaacaggcc 

ttaagaggatggccgtgactgtgggtgagcaggcgcagcgtaggggcagagt 

aggtagagtgaaacccgggaggtattataggagccaggaaacagcaacaggg 

tcaaaggactaccactatgacctcttgcaggcacaaagatacgggattgagga

tggaatcaacgtgaajaaatcctttagggaqatgaatracxaatrggagcxtata

cgaggaggacagcctacraataacccagcnx3gaaatactaaataatctactcat

ctcagaaqacntgccagccgcixntaagaacataatggckiaggacngatcacc

cl^gagcxraatccaacttgcatacvyicagctatgaagtccaggtcccxs^

tcccaaaaataaggaatggagaagtcacagacacctacgaaaattactcgtttc

taaatgccagaaagttaggggaggatgtgcccgtgtatatctacgctactgaa 

gatgaggatctggcagttgacctcttagggctagactggccrgatcctgggaa 

ccagcaggtagtggagactggtaaagcactgaagcaagtgaccgggttgtcct 

cggctgaaaatgccctactagtggcntatttgggtatgtgggttaccaggcrc 

tctcaaagaggcatgtccx:aatgataacagacatatataccatcgaggaccaga 

gactag/wvgacaccacccacctccagtatgcacccaacgccataaaaaccx3at 

gggacagagacroaacrgaaagaa(nx3gcgtcgggtgacgtggaaaaaatca 

TGGGAGCCATrrcAGATrATGCAOCTGOGGGACroGAGTrrGT^ 

GCAGAAAAGATAAAAACAGCKXHTrGTITAAAGAAAACGCAGAAGCCGCAAA 

AGGGTATGTCCAAAAATTCATrOACrCATrAATroAAAATAAAGAAGAAATAAT 

CAGATATGGTTTCTGGGGAACACACACAGCACTATACAAAAGCATAGCrGCAA 

GACTGGGGCATGAAACAGCGTITGCCACACTAGTGTTAAAGTGGCTAGCTTTT 

GGAGGGGAATCAGTGTCAGACCACGTCAAGCAGGCGGCAGTTGATTTAGTGG 

TCTATTATGTGATGAATAAGCCITCCTrCCCAGGTGACTCCGAGACACAGCAAG 

AAGGGAGGCGATTCGTCGCAAGCCTGTTCATCTCCGCACTGGCAACCrACACA 

TACAAAACTTGGAATTACCACAATCrCTCTAAAGTGGTGGAACCAGCCCTGGCr 

TACCTCCCCTATGCTACCAGCGCATTAAAAATGTTCACCCCAACGCGGCTGGAG 

AGCGTGGTGATACTGAGCACCACXlATATATAAAACATACCTCrCTATAAGGAAG 

GGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGCAGCCATGGAAATCC 

TGTCACAAAACCCAGTATCGGTAGGTATATCrGTGATGTTGGGGGTAGGGGCA 

ATCGCraCX}CACAAaKTATTQAGTXX:AGTGAACAGAAAAGGACaTAOT 

GAAGGTGTTTOTAAAGAACTirrTGOATCAGGCroCAACAGATGAGCTGG^ 
AAGAAAACCCAGAAAAAATTATAATGGCCrTATTTGAACK:AGTC^ 

GTAACCCCCTGAGACrAATATACCACCTCTATGGGGTTTACTAC/ 

AGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACTTATTCACATTGATA 

ATGTTTGAAGCCTTCGA GTTA TTAGGGATGGACTCACAAGGGAAAATAA 

CCTGTCCGGAAATTACATTTTOGATTTGATATACGGCCTACACAAGC 

CAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGCACCCTTTAGTTGTG 

ACTGGACCCCTAGTGACGAGAGGATCAGATTGCCAACAGACAACTATTTGAGG 

GTAGAAACCAGGTGCCCATGTGGCTATGAGAT GAAA GCnrCAAAAATGTAGG 

TGGCAAACITACCAAAGTGGAGGAGAGCGGGCCnTCCTATGTAGAAACAGAC 

CTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATTACGATGACAACCTC 

AGAGAGATAAAACrAGTAGCAAAGTIXKjAAGGACAGGTAGAGCACTACTAC/^ 

AGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAAAATGCrcnTGC^ 

ACAAGTGGGAGGTGGAACATGGTXjTCATAACCAGGTTAGCTAAGAGATATACT 

GGGGTCGGGTTCAATGGTGCATACrrAGGTGACGAGCCCAATCACCGTGCTCT 

AGTGGAGAGGGACTGTGCAACTATAACCAAAAACACAGTACAGTTTCri^^ 

GAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTCCAATCTGACCAGGC 

TCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGGAAATACCCACCGCT 

ACGGTCACCACATGGCTAGCTTACACCTTCGTGAATGAAGACGTAGGGACTAT 

AAAACCAGTACTAGGAOAGAGAGTAATCCCCGACCCTGTAQTTGATATCAATTT 

ACAACXAGAGGTGCAAGTGGACACGTCAGAGGTTGGGATCACAATAATTC^ 

GGGAAACCCTGATGACAACGGGAGTGACACCTGTCITGGAAAAAGTAGAGCCT 

GACGCCAGCGACAACCAAAACTCGGTGAAGATCGGGTTGGATGAGGGTAATTA 

CCCAGGGCCTGGAATACAGACACATAQ^CTAACAGAAGAAATACACAACAGGG 

ATGCGAGGCCCrrCATCATGATCCTGGGCTCAAGGAATTCCATATCAAATAGGG 

CAAAGACTGCTAGAAATATAAATCraXACACAGGAAATGACCCCAGGGAAAT^ 

CGAGACTTGATGGCrGCAGGGCGCATGITAGTAGTAGCACTGAGGGATGTCGA 

CCCTGAGCTGTCTGAAATGGTCGATrTCAAGGGGACl'ri'ri'rAGATAGGGAGG 

CCCTGGAGGCTCTAAGTCTCQGGCAACCTAAACCGAAGCAGGTTACa^ 

GCTGrrAGGAATTTGATAGAACAGAAAAAAGATGTGGAGATCCCTAA(^^ 

GCATCAGATGAOXAGTATITCTGGAAGTGGCCTrAAAAA^ 

TTAGTAGGAGATGTTOGAGAGCTAAAAGATCAAGCTAAAGCAC^ 

GGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGGACGTATGCCATGAAGC 

TATCTAGCrGGTTCCTCAAGGCATCAAACAAACAGATGAGTTTAACTrc^ 

TTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAGAGCAATAAGGGGCAC 

ATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAGCCCCTCGGTTGCGG 

GGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGATACACCCATATGAAG 

CITACCrGAAGrK3AAAGATTTCATAGAAGAAGAAGAGAAGAAACCTA» 

AAGGATACAGTAATAAGAGAGCACAACAAATGGATACTTAAAAAAATAAGGm 

CAAGGAAACCTCAACACCAAGAAAATGCTCAACCXZGGGGAAACTATC^ 

GTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCACCAGATTGGTACT 

ATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCCAATAGTGAGGGCCCA 

AACCGACACCAAAACCTTTCATGAGGCAATAAGAGATAAGATAGACAAGAGTG 

AAAACCGGCAAAATCCAGAATTGCACAACAAATrGTTGGAGArm 

TAGCCCAACCCACCCTGAAACACACCTACGGTGAGGTGACGTGGGAGCAACTT 

GAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCTTCCTGGAGAAGAAGAACA 

TCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGAACAATTGGTCAGGGAT 

CrGAAGGC(XGGAGAAAGATAAAATATrATGAAACKX:AATACCA^ 

GAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACCTGGTGGTTGAGAAG 

AGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAAGGCTAGCCATCACTAA 

GGTCATGTATAACTOGGTGAAACAGCAGCCCGTTGTGATTCCAGGAT^ 

GAAAGACCCCCTTGTrCAACATCTITGATAAAGTGAGAAAGGAATGG^ 

TCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGCCTGGGACACT^ 

ACTAGTAAGGATCIXXIAACrrATnSGAGAAA 

GAGTGGCACAAGTTCATTGACACCATCACCGACCACATGACAGAAGTACC^ 
TATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGAGAGGGAGCGGC 

CAGa:AGACACAAGTG<nX3GCAAQ\GCATGTTAAATGTCCTGACAATGATGTA 

CGGCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGTTTCAACAGGGTGGCAA 

GGATCCACGTCTGTGGGGATGATGGCIT(m'AATAACTG/iu\AAAGGGTTAGGG 

CTGAAATrrGCrAAC/>LAAGGGATGCAGATTCITCATGAAGCAGGCAAACCrCAG 

AAGATAACGGAAGGGGAAAAGATGAAAGTTGCCTATAGATTTGAGGATATAGA 

GTTCTGTTCTCATACCCCAGTCCXTGITAGGTGGTCCGACAACACCAGTAGTCA 

CATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATGGCAACAAGATTGG 

ATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAAAAGCGGTAGCCTTCAGT 

TTCTTGCrGATGTATTCCTGGAACCCGCTTGTTAGGAGGATTTGCCTGTTGGTC 

<nTTCX}CAACAGCCAGAGACAGACCCATCAAAACATGCCACrrATrATTACAAA 

GGTGATCCAATAGGGGCCrATAAAGATGTAATAGGTCGGAATCTAAGTGAACT 

GAAGAGAACAGGCTITGAGAAATIXjGCAAATCTAAACCTAAGCCTGTCCACGTT 

GGGGATCTGGACTAAGCACACAAGCAAAAGAATAATTCAGGACTGTGTTGCCA 

TTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCGACAGGCTGATATCCAGC 

AAAACrcGCCAOTATACATACCTGATAAAGGCmACATrACAAGOAAAGCAT 

TATGAGCAACrGCAGCTAAGAACAGAGACAAACCCGGTCATGGGGGTTGGGA 

CTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCrcCTGAGAAGGTTGAAA 

ATIXrrGCrCATGACGGCCGTCGGCGTCAGCAGCnXjAaggttggggtaaacactocggn 

cmccttctttaatggtggctccatcttagcccmgtcacggctagcngtgaaaggtccgtgagcxgcatgactgcagagag^ 

ggcctctctgcagatcatgtCCCCCGGCCGTCGGCGTCAGCTGAgacaaaatgtatatattgtaaataaattaatc 

catgtacatagtgtatataaatatagttgggacx:gt(xacctcaagaagacgacacgcccaacacgcacagctaaacagtagtcaagatt 

atctacctc aagataac actacatttaatgcacacagcactttagctgtatgaggatacgcccgacgtctatagttggact^ 

ctaacagooccc 
iH3Bfrag 
1.1.4 seq 

1.2.3 seq 
6.2.2 seq 

6.1.4 seq 



AAlgCIXXTp^TXSACX^GCCCTCX 

20 30 40 50 60 70

AATTCTGCTCyVTGACQGCCGTCGK 7q 
AATTCTCCICATG^ 

AATTCTOCTO^ 70 



Bip&frag 
lil.4 seq 
l<j2.3 seq 
6<j2.2 seq 
6J1.4 seq 



TTTCxrc \:,uuTiTiU ' iuuuuiiiuu ' iTmTr - 



80 



-r 
90 



100 



I 

110 



120 



iTo 



140 



TTTCCI\>i-i-i-i-i-ii-i-i-i^ 140 

TTTcxri \:,rrixii ' iiiiiTmTmrm ' y ' to? 

TTTCCT tS T' mTl ' lTmTmTm ' I ' i '--- " * 00 

TTICCTGTTTTTT ^ 



BipBfrag 
4 seq 
3 seq 
6J2.2 seq 
64l.4 seq 



1. 
1.2 



ilo 



160 



^170 180 190 200 210 

. Cll^lccTXLTlTmnl xrx ^ l^,m ^ xLvxl^wl-Tlvm AA^ 140 

ClTl\,LlUCXTiriTi'CLl"Xl\.l"iU'XLLaUt.LlUV,n-XTAATG 125 



3¥, 3Bf rag 
1.4 seq 

3 seq 
2 seq 

4 seq 



1. 
1. 
6 

6. 



GTGCXrPCCATOTAGCCCry^TC^^ 

220 230 ?40 250 260 2?0 280 

CnGGCTCCAlxnTACSCCC^^ 2g0 

GTGGCTCCMCrpwXOT 219 

^^^^^J^S^S^tS^^ 212 

SSgSS^IE^^ 210 



3l^B£rag 
4: seq 
I • 3* seq 
1.2 seq 

k.4-*seq\. 



1. 
1. 
6 
6 



BH^Bfrag 
4 seq 

rrs-seq 

2 seq 
4 seq 



1- 
6 



290^^300 ,3io iio^ Ho ; alT" ySo 



TCftTOCTj^jvgjTOk^^ 280' 

360 370 380 390 400 

TGTAAATAAATTAAICCAIIXn'ACAI^^ 4Q2 

TGTi^TAAArrAATCCATCTACMT^^ 341 

TGTAAATAAArrMTCCATGTACATAGTCTATATAAATATA^^ 334 

TCTAAATAAATTAATCCATGTACATAGT^ 332 

TCTAAATAAATTAATCCMGO^^ 33^7 



FIGURE 20 
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Gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctaggtacatggcacgtgccagccccct 

gatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgag 

tgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgac

cgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaa

aggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATGAGCACGAATC 

CTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGTCGCCCACAGGACGTC 

AAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAG 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGACGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGTTGCGGG

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCCCCAC 

AGACCCCCGGCX5TAGGTCGCGCAATrTGGGTAAGGTCATCGATACX:CITACXiT 

GCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGA

GGCGCTGCCAGGGCCCTGGCGCATGGCGT CCGG GTTCTGGAAGACGGCGTGA 

ACTATGCAACAGGGAACCITCCTGGTTGCKnTrCTCTATCTrC(nTCT^^ 

GCTCTCTTGCCTGACCGTGCCCGCTTCAGCCTACCAAGTGCGCAATTCCTCGGG

GCTTTACCATGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGGC

CGATGCCATCCTGCACACTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGTAACG 

CCTCGAGGTGTTGGGTGGCGGTGACCCCCACGGTGGCCACCAGGGACGGCAA 

ACTCCCCACAACGCAGCTTCGACGTCATATCGATCTGCTTGTCGGGAGCGCCA

CCCTCTGCTCGGCCCTCTACGTGGGGGACCTGTGCGGGTCTGTCTTTCTTGTTG

GTCAACTGTTTACCTTCTCTCCCAGGCGCCACTGGACGACGCAAGACTGCAATT

GTTCTATCTATCCCGGCCATATAACGGGTCATCGCATGGCATGGGATATG^ 

TGAACTGGTCCCCTACGGCAGCGTTGGTGGTAGCTCAGCTGCTCCGGATCCCA 

CAAGCCATCATGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCAT 

AGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGC

TATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCG 

CACCACGGCTGGGCTTGTTGGTCTCCTTACACCAGGCGCCAAGCAGAACATCC

AACTGATCAACACCAACGGCAGTTGGCACATCAATAGCACGGCCTTGAACTGC

AATGAAAGCCTTAACACCGGCTGGTTAGCAGGGCTCTTCTATCAGCACAAATTC

AACTCTTCAGGCrGTCCTGAGAGGTrGGCCAGCTGCCGACGCCTTACCX3ATnT O 

GCCCAGGGCTGGGGTCCTATCAGTTATGCCAACGGAAGCGGCCTCGACGAAC

GCCan'ACTGCTGGCACTAC(XT<XAAGACXnTGTGQCATT0TX3^ 

AGCXrraTGTOQCCCGQTATATTGCITCACnrCCAGCCrCQTGGT^ 

GA(X:GACAGGTCGGGCG<XS(XTACCTACAGCroGGGTGCAAATGATA<:GGAT 

GTCTTCGTCCTTAACAACACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTACC

TGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCCCCTTGTGTCAT

CGGAGGGGTGGGCAACAACACCTTGCTCTGCCCCACTGATTGTTTCCGCAAGC

ATCCGGAAGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATTACACCCAGG 

TGCATGGTCGACTACCCGTATAGGCTTTGGCACTATCCTTGTACCATCAATTAC

ACCATATTCAAAGTCAGGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAG 

CGGCCTGCAACTGGACGCGGGGCGAACGCTGTGATCTGGAAGACAGGGACAG

GTCCGAGCTCAGCCCATTGCTGCTGTCCACCACACAGTGGCAGGTCCTTCCGT 

GTlXnTTCAOSACCCTGCOVGCCTKntXT^CCG^ 

acattgtggacgtgcagtacttgtacggggtagggtcaagcatcgcgtcctgg 
gccattaagtgggagtacgtcgttctcctgttcctcctgcrtgcagacgcgcgc 
gtctgctcctgcttgtggatgatgttactcatatcccaagcggaggcggctttg 
gagaacctcgtaatactcaatgcagcatccctggccgggacxx:acggtcttgt 

GTCCTTCCTCGTGTTCTTCTGCTTTGCGTGGTATCTGAAGGGTAGGTGGGTGCC
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CGGAGCGGTC7rACCK:CTTCTACGGGAAGTGGGTCITACTCnTA 

AGTGGTACACCCAATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGT 

GGTAAAGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGT^ 

TTACAACAGTAGTACTAATCGTCATAGGTTTAATCATAGCTAGGCGTGACCCAA 

CTATAGTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACC 

CACCAGCCTGGAGTTGACAT CGCrG TGGCGGTCATGACTATAACCCTACTGAT 

GGTTAGCTATGTGACAGATTATmAGATATAAAAAATGGTTACAGTGCATrcrC 

AGCCTGGTATCTGCGGTGTTCITGATAAGAAGCCTAATATACCT^ 

GAGATGCCAGAGGTAACTATCCCAAACTGGAGACCACTAACTrrAATACTAT^^ 

TATTTGATCTCAACAACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTA

TTGTTGCAATGTGTGCCTATCTTATTGCTGGTCACAACCTTGTGGGCCGACrrCT 

TAACCCTAATACTGATCCTGCCTACCTATGAATTGGTTAAATTATACTATCTC 

AACTGTTAGGACTGATACAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAA

GAGTTGACTCCATCTACGACGTTGATGAGAGTGGAGAGGGCGTATATCTTTTTC 

CATCAAGGCAGAAAGCACAGGGGAATTTTTCTATACTCTTGCCCOT 

CAA CACT GA TAAGT TGCGTCAGCAGTAAATGGCAGCTAATATACATGAGITACT 

TAACTTrGGACmATGTACTACATGCACAGGAAAGTTATAGAAGAGATC^^ 

GAGGTACCAACATAATATCCAGGTTAGTGGCAGCACrCATAGAGCTGAACTX^ 

TCCATGGAAGAAGAGGAGAGCAAAGGCTTAAAGAACjITITATCT 

AAGGTTGAGAAACCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTT 

CTTGGTACGGGGAGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATC 

AAGGCCAGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGA 

GGGCCGAGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAG 

CCG ATAA CGTGTGGGATGTCG CTAG CAGATTTTGAAGAAAGACACTATAAAAG 

AATCnTrATAAGGGAAGGCAACTrrGAGGGTATGTGCAGCCGATGCCAG^ 

AGCATAGGAGGTTTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCT 

GAGTGTAATAGGCTGCATCCTGCT GAGG AAGGTGACnTITGGGCAGA 

C\TGTTGGGCCTCAAAATCACCTACTITGCGCTGATGGAT^ 

TATCACAGAGTGGGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACA 

GAGT CCCr rGTCACATCTCATTTGGTTCACGGATGCCTITCAGGCAG^ 

ATGGCTTTGTACAATATACCGCTAGGGGGCAACTATTTCrGAGA^ 

TACTGGCAACTAAAGTAAAAATGCTCATGGTAGGCAACCITGGAGAAGA^ 

GGTAATCTGGAACATCITGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAA 

GATCACAGAGCACGAAAAATGCCACATTAATATACTGGATAAACrAACCGCATT 

TTTCGGGATCATGCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTA 

CGAGCirACTAAAAGTGAGGAGGGGTCTGGAGACTGCCTGGGCTTACACACAC 

CAAGGCGGGATAAGTTCAQTCXjACCATGTAACCGCCGGAAAAGATCTACTGGT 

CTGTGACAGCATGGGACGAACTAGAGTGGTTTGCCAAAGCAAC^ 

CCGATGAGACAGAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGC 

CAGATGTTATGTGTTAAATCCAOAGGCCGTTAACATATCAGGATCCAAAGGGG 

CAGTCGTTCACCTCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCA 

GGCACACCGGCmCTTCGACCTAAAAAACTrGAAAGGATGGTCAGGCT^ 

ATATTTGAAGCCTCCAGCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGA 

ATGAAGAGTCrAAACCrACAAAAATAATGAGTGGAATCCAGACOTTCTCAAAA^ 

ACAGAGCAGACCTGACCGAGATGGTCAAGAAGATAACCAGCATGAACAGGGG 

AGACrrCAAGCAGATTACTTTGGCAACAGGGGCAGGCAAAACCACAGAACrCC 

CAAAAGCAGTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATA 

CCATTAAG GGCAG CGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCC 

AAGCATCTCrmAACCTAAGGATAGGGGACT^TGAAAGAGQGGGACATGG^^ 

CCXXjGATAACCTATGCATCATACGGGTACTICTGCCAAATGC^ 

TCAGAGCTGCrATGGTAGAATA<nx:ATACATATTCTTAGATGAATACC^ 

CACTCCTGAACAACrGGCAATTATCGGGAAGATCCACAGATTTTCA^^ 

AAGGGTTGTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGT 

CAAAAGCACCCAATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGG 
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ATCTTGGTAGTCAGTTC CTTGATA TAGCAGGGTTAAAAATACCACTC^ 

TGAAAGGCAATATGTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGT^ 

GCAAAGAAGCTAAAAGCTAAGGGCTATAACTCTGGATACTATTACAGTC^ 

GGATCCAGCCAATCTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGC 

TACAAATGCTATTGAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGA 

CACGGGGTTGAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCA 

TCGTAACAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCG 

TAGGGGCAGAGTAGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAA 

ACAGCAACAGGGTCAAAGGACTACCACTATGACCrCTTGCAGGCACAAAGATA 

CGGGATTGAGGATGGAATCAACGTGACGAAATCCTTTAGGGAGATGAATTACG 

ATTGGAGCCTATACGAGGAGGACAGCCTACTAATAACCCAGCrGG/WVATACTA 

AATAATCTACTCATCrCAGAAGACITGCCAGCCGCrGTTAAGAACATAATGGCC 

AGGACTGATCACCCAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCA 

GGTCCCGGTCCTATTCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACG 

AAAATTACTCGTTTCTAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTAT^ 

TCTACGCTACTGAAGATGAGGATCTGGCAGTTGACCTCTTAGGGCTAGACrGG 

CCTGATCCTGGGAACCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGT 

GACCGGGTTGTCCTCGGCTGAAAATGCCXn'ACTAGTGGCnTrAm 

GQGTTACC:L\GGCrcrcrCAAAGAGGCATGTCCCAATGATAACA 

CATCGAGGACCAGAGACTAGAAQACACCACCCACCTCCAGTATGCACCCAACG 

CCATAAAAACCGATGGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGA 

CGTGGAAAAAATCATGGGAGCCATTTCAGATTATGCAGCrGGGGGACTGGAGT 

TTGTTAAATCXXAAGCAGAAAAGATAAAAACAGCTCCTITGTTT/^ 

CAGAAGCCGCAAAAGGGTATG rCCA AAAATTCATTGACrCATTAATTGAAAATA 

AAGAAGAAATAATCAGATATGGTTTGTGGGGAACACACACAGCACTATACAAA 

AGCATAGC TGCAA GACnXjGGGCATGAAACAGCGTTTGCCACACTAGTGTC 

GTGGCTAGCTTTTGGAGGGGAATCAGTGTCAGACCACGTCAAGCAGGCGQCA 

GTTGATTTAGTGGTCTATTATGTGATGAATAAGCCITCCr^ 

GAGACACAGCAAGAAGGGAGGCGATTCGTCGCAAGCCTGTTCATCrCCGCACT 

GGCAACCTACACATACAAAACITGGAATTACCACAATCTCTCTAAAG^ 

ACCAGCCCTGGCTTACCTCCCCTATGCrACCAGCGCATTAAAAATGTTCACCCC 

AACGCGGCTGGAGAGCGTGGTGATACTGAGCACCACGATATATAAAACATACC 

TCrCTATAAGGAAGGGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGC 

AGCCATGGAAATCCTGTCACAAAACCCAGTATCGGTAGGTATATCTGTGATGTT 

GGGGGTAGGGGCAATCGCTGCGCACAACGCTATTGAGTCCAGTGAACAGAAA 

AGGACCCTACTTATGAAGGTGTITGTAAAGAACri'ClU'GGATCAGGCT 

GATGAGCrGGTAAAAGAAAACCCAGAAAAAATTATAATGGCCTrATITGAAC^ 

GTCCAGACAATTGGTAACCCCCTGAGACTAATATACrACCTGTATGGGGm 

TACAAAGGTTGGGAGGCCAAGGAACTATCTGAGAGGACAC^ 

TATTCACATTGATAATGTITGAAGiXTrCGAGTTATrAGW 

GGAAAATAAGGAACCrGTCCGGAAATTACATTTTGGATTTGATAT^ 

ACAAGCAAATCAACAGAGGGCTOAAGAAAATGG 

ACCCTTTAGTTGTGACrGGACCCCTAGTGACGAGAGGATCAGATrGCCAACAG 

ACAACTATTTGAGGGTAGAAACCAGGTGCCCATGTGGCTATGAGATGAAAGCr 

TTCAAAAATGTAGGTGGCAAACTTACCAAAGTGGAGGAGAGCGGGCCm 

ATGTAGAAACAGACCTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATr 

ACGATGACAACCrCAGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTA 

GAGCACTACTACAAAGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAA^ 

GCrCTTGGCCACTGACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAG 

CTAAGAGATATACTGGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCCC 

AATCACCGTGCICTAGTGGAGAGGGACTGTGCAACTATAACCAAAAACAC^ 

ACAGTTTCTAAAAATGAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTC 

CAATCTGACCAGGCTCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGG 

AAATACCCA<XGCrACOGTCACXL\CATGGCrAGCITACACCTTCGTC 
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acgtagggactataaaaccagtactaggagagagagtaatccccgaccctgta 

gttgatatcaatttacaaccagaggtgcaagtggacacgtcagaggttgggat 

cacaataattggaagggaaaccctgatgacaacgggagtgacacctgtcttgg 

aaaaagtagagcctgacgccagcgacaaccaaaactcggtgaagatcgggttg 

gatgagggtaattacccagggcctggaatacagacacatacactaacagaaga 

aatacacaacagggatgcgaggcccttcatcatgatcctgggctcaaggaatt 

ccatatcaaatagggcaaagacrcctagaaatataaatctgtacacagga^ 

accccagggaaatacgagacttgatggctgcagggcgcatgttagtagt agca 

ctgagggatgtcgaccctgagctgtctgaaatggtcgatttcaaggggact^ 

tttagatagggaggccctggaggcictaagtctcgggcaacct 

aggttaccaaggaagctgttaggaatitgatagaacagaaaaaagatgtgga 

atccctaactggttracatcagatgacccagtatttctggaagtggccrraa/^ 

aatgataagtactacttagtaggagatgttggagagctaaaagatcaagcraa^ 

gcacttggggccacggatcagacaagaattataaaggaggtaggctcaagga 

cgtatgccatgaagctatctagctggttccrcaaggcatcaaacaaacag 

gtiraactccactgtttgaggaattgttgcracggtgcccacctgcaact 

gcaataaggggcac7vtggcatcagctta(xaattggcacagggtaact 

cccctcggttgcggggtgcacctaggtacaataccagccagaagggtgaagat 

acaccxl^tatgaagcrracctgaagtrgaaagatttcatag^ 

gaaacctagggttaaggatacagtaataagagagcacaacaaat^ 

aaaaaataaggtttcaaggaaacctcaacaccaagaaaatgctcaacc^ 

aaactatctgaacagttggacagggaggggcgcaagaggaacatctacaacca 

ccagattggtactataatgtcaagtgca ggcat aaggctggagaaattgccaa 

tagtgagggcccaaaccgacacxzaaaaccrrrcatgaggcaataagagataag 

atagacaagagtgaaaaccggcaaaatcx::agaattgcacaac^^ 

gatntccacacgatagcccaacccaccctgaaacacacctacggtgaggtga 

CGTGGGAGCAACrrcAGGCGGGGGTAAATAGAAAGGGGGCAGCAGGCrrCCT 

GGAGAAGAAGAACATCGGAGAAGTArrGGATTCAGAAAAGCACCnX3GTAGAAC 

AATTGGTCAGGGATCTGAAGGOCGGGAGAAAGATAAAATATrATGAAAC^^ 

ATACCAAAAAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACC 

TGGTGGTTGAGAAGAGGCCAAGAGTTATCCAATACXXn'GAAGCCAAGACAAGG 

CTAGCCATCACTAAGGTCATGTATAACTGGGTGAAACAGCAGCCCGTrGTGATr 

CCAGGATATGAAGGAAAGACCCCCnTGTTCAACATCnTC 

GAATGGGACrCGTTCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGCCTG 

ggacactcaagtgactagtaaggatctgcaacitatrggagaaatc^ 

itacrataagaaggagtggcacaagtrcattgacaccatcaccxiaccacatgac 

agaagtacx:agttataacagcagatggtgaagtatatataagaaatgggcaga 

gagggagcggccagccagacacaagtgctggcaacagcatgttaaatgtcct 

gacaatgatgtacggcrrcrgcgaaagcacaggggtaccgtacaagagtitca 

acagggtggcaaggatccacgtctgrixxxsgatgat^^ 

aaagggttagggctgaaatttgctaacaaagggatgcagattc^^ 

AGGCAAACCTCAGAAGATAACGGAAGGGGAAAAGATGAAAGTTGCCT 

TTGAGGATATAGAGTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCCGACA 

ACACCAGTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATG 

GCAACAAGATTGGATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAAAAGC 

GGTAGCCTTCAGTTTCTTGCTGATGTATTCCrGGAACCCGCTTGTT^ 

TTGCCTGrrGGTCCTTTCGCAACAGCCAGAGACAGACCCATCAAAACATGCCAC 

TTATTATTACAAAGGTGATCCAATA GGGG CCTATAAAGATGTAATAGGTCGGAA 

TCTAAGTGAACTGAAGAGAACAGGCmGAGAAATTGGCAAATCTAAACCT 

(XTGTCCACGTTGGGGGTCTGGACTAAGCACACAAGCAAAAGAATAATO 

ACTCTGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAA C^ 

CTGATATCCAGCAAAACTGGCCACTrATACATACCTO 

AAGGAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCX^ 

GGGGTTOGGACTGAGAGATACAAGTTAGGT«:CATAGTCAATCTGCrGCTGAG 
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AAGGTTGAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCrGAgacaaaatgtat 
atattgtaaataMCtaatccatgtacatag^gtatataaatatagngggaccgtccacctcaagaagacgacac 
ctaaacagtagtcaagattatctacctcaagataacacUcatttaatgcacacagcactttagctgtatgagga^ 
ttggactagggaagacctctaacagccccc 




wo 99/55966 



PCT/US99/08850 



55/67 





Gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctaggtacatggcacgtgccagccccct

gatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgag 

tgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgac 

cgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaa 

aggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATGAGCACGAATC 

CTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGTCGCCCACAGGACGTC

AAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAG 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGACGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGTTGCGGG 

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCCCCAC 

AGAaraXKKXn-AGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGT 

GCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGA 

GGCGCTGCCAGGGCCCTGGCGCATGGCGT CCGG GTTCTGGAAGACGGCGTGA 

ACrATGCAACAGGGAACCITCCTGGTrGCrCTTTCTCTATCTr<XTTCTGGCCCT 

GCTCTCTTGCCTGACCGTGCCCGCTTCAGCCTACCAAGTGCXjCAATrCCTCOGG 

GCnTACCATGTCACCAATGATTGCCCTAACTCGAGTA'TTGTGTACGAGGCGGC 

CGATGCCATCCTGCACACTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGTAACG

CCTCGAGGTGTTGGGTGGCGGTGACCCCCACGGTGGCCACCAGGGACGGCAA

ACTCCCCACAACGCAGCTTCGACGTCATATCGATCTGCTTGTCGGGAGCGCCA 

CCCTCTGCTCGGCCCTCTACGTGGGGGACCTGTGCGGGTCTGTCTTTCTTGTTG

GTCAACraTTTACCITCTCTCXrCAGGCGCCACTGGACGAaJCAAGACT^^ *7 

GTTCTATCTATCCCGGCCATATAACGGGTCATCGCATGGCATGGGATATGATGA 3 

TGAACTGGTCCCCTACGGCAGCGTTGGTGGTAGCTCAGCTGCTCCGGATCCCA

CAAGCCATCATGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCAT

AGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGC

TATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCG

CACCACGGCTGGGCTTGTTGGTCTCCTTACACCAGGCGCCAAGCAGAACATCC

AACIX3ATCAACACCAACGGCAGTTGGCACATCAATAGCACGGCCTTGAACTGC 

AATGAAAGCCTTAACACCGGCTGGTTAGCAGGGCTCTTCTATCAGCACAAATTC

AACTCTTCAGGCnXjTCCroAGAGGTTGGCCAGCTGCCGACGCCTTACCGArnT 

GCCCAGGGCTGGGGTCCTATCAGTTATGCCAACGGAAGCGGCCTCGACGAAC

GCCCCTACrGCTGGCACTACCCTCCAAGACCnTGTGGC\TTGTGCCCGCAA^ 

AGCGTGTGTGGCCCXiGTATATroCTrCACTCCXAGCCCCXrrGGTGGTGGGAAC 

GACCGACAGGTCGGGCGCGCCTACCTACAGCTGGGGTGCAAATGATACGGAT 

GTCITOjTCCITAACAACACCAGGCCACCXKnXKKjCAATTC^^ 

TGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCCCCTTGTGTCAT 

CGGAGGGGTGGGCAACAACACCTTGCTCTGCCCCACTGATTGTTTCCGCAAGC

ATCCGGAAGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATTACACCCAGG 

TGCATGGTCGACTACCCGTATAGGCTTTGGCACTATCCTTGTACCATCAATTAC

ACCATATTCAAAGTCAGGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAG 

CGGCCTGCAACTGGACGCGGGGCGAACGCTGTGATCTGGAAGACAGGGACAG 

GTCCGAGCTCAGCCCATTGCTGCTGTCCACCACACAGTGGCAGGTCCTTCCGT

GTTCTITCACGACCCTGCCAGCXnTGTCCACCGGCCTCATCCACCTCCACCAGA 

ACATTGTGGACGTGCAGTACTTGTACGGGGTAQQQTCAAGCATCGCGTCCTGG 

GCCATTAAGTOGGAGTACGTCGTrCrCCroTTCCTCCTGCTTGCAGACG C^ 

GTCIXXrrCCTGCTTGTGGATGATGTTACTCATATCXZCAAGCGGAGGCGGCTTTC 

GAGAACCTCGTAATACTCAATGCAGCATCC(nX3GCCX3GGACGCAaKnxn^ 
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GTCCTTCCrCGTGTTCTTCTGCTTTGCGTGGTATCTGAAGGGTAGGTGGGTGCC 

CGGAGCGGTCTACGCCTTCTACGGGAAGTGGGTCTTACTCTTATACCACATCrT 

AGTGGTACACCCAATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGT 

GGTAAAGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTT 

TTACAACAGTAGTACTAATCGTCATAGGTTTAATCATAGCTAGGCGTGACCCAA 

CTATAGTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACC 

cacx:agcctggagttgacat cgctg tggcggtcatgactataaccctactgat 

GGTTAGCTATGTGACAGATTATTTTAGATATAAAAAATGGTTACAGTGCATTCTC 

AGCCTGGTATCKK^jGTGTTCTTGATAAGAAGCCTAATATACCT 

GAGATGCCAGAGGTAACTATCa:AAACIX3GAGACCACTAACTrTAATACrATTA 

TATTTGATCTCAACAACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTA

TTGTTGCAATGTGTGCCTATCTTArrGCTGGTCACAACCITGTGQGCeGACTTCT 

TAACCXTAATACTGATCCTGCCTACCTATGAATTGGTrAAATTATACTATCTGA^ 

AACTGTTAGGACTGATACAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAA

GAGTTGACTCCATCTACGACGTTGA TGAGAG TGGAGAGGGCGTATATCl'l'i'riC 

CATCAAGGCAGAAAGCACAGGGGAATITITCTATACTCmXjCCCCTrATC^ 

CAACACTGAT AAGT KXXjTCAGCAGTAAATGGCAGCTAATATACATGAGTTACT 

TAAtnTrGGACTTTATGTACTACATGCACAGGAAAGTrATAGAAGAGATCT^ 

GAGGTACXAACATAATATCCAGGTTAGTGGCAGCACnXIATAGAGCrGAACT^ 

TCTATGGAAGAAGAGGAGAGCAAAGGCTrAAAGAAGTTrrATCTATTaTCTGG 

AAGGTKJAGAAACXTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCrr 

CrrGGTACGGGGAGGAGGAAGTCTACX3GTATGCX:AAAGATCATGACrATAATC 

AAGGCCAGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGA 

GGGCCGAGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAG 4 

CCCATAACGTGTGGGATGTOK n'AGC AGArnTGAAGAAAGACACrATAAAAG <^ 

AATCTITATAAGGGAAGGCAACITTGAGGGTATGTGCAGCCGATGCCAGGGAA M 

AGCATAGGAGGTTTGAAATGGACCXjGGAACCTAAGAGTGCCAGATACTGTGCT « 

GAGTGTAATAGGCKKATCXTGCT GAGG AAGGTGACmTGGGCAGAGTCGAG R 

CATGTTGGGCCTCAAAATCA<XTACITraCX3CTGATGGATGGAAAGG^ S 

TATCACAGAGTGGGCTGGATGCCAGCGTGTGGGAATCrCCCCAGATACCCACA ^ 

GAGTCCCTTGTCACATCTCATTKXSTTCACGGATGCCrTTCAGGCAGGAATACA 

AIGGCirrcTACAATATACCGCrAGGGGGCAACrATTTCrGAGAAACITGCCCG 

TACTGGCAACTAAAGTAAAAATGCrCATGGTAGGCAACCTTGGAGAAGAAATr 

GGTAATCTGGAACATCTTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAA 

GATCACAGAGCACGAAAAATGCCACATTAATATACTGGATAAACTAACCGCATT 

TTTCGGGATCATGCCAAGGGGGACTACACCCAGAGa:CCGGTGAGGTTCCCTA 

CGAGCTTACrAAAAOTGAGGAGGGGTCrGGAGACrGCCrGGGCITACACACAC 

CAAGGCGGGATAAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGT 

CTGTOACAGCATGGGAa}AACTAOAGTGGTrrG<X:AAAGCAACAACAGGTTGA 

OXSATGAGACAGAGTATGGCGTCAAGACTGACTCAGGOTGCCCAGACGGTGC 

CAGATGTTATGTGTTAAATa7k.GAGGCCGTrAACATATCAGGATCa\^ 

CAGTCGTTCACCTCCAAAAGACAGGrGGAGAATTCACGTGTGTCACXZG^ 

GGCACACCGGCllUtJi'lCGACCTAAAAAACTIX}AAAGGATGGTCAGGCTrGCCr 

ATATTTGAAGCCTCCAGCXSGGAGGGTGGTrGGCAGAGTCAAAGTAGGGAAGA 

ATGAAGAGTCTAAACXTACAAAAATAATGAGTGGAATCXIAGACCGTCTCAAAAA 

ACAGAGCAGACCrGACCGAGATGGTCAAGAAGATAACX:AGCATGAACAGGGG 

AGACrrCAAGCAGATTACTTTGGCAACAGGGGCAGGCAAAACCACAGAACTCC 

CAAAAGCAGTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCrTATA 

CCATTAAGGGCAGajGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCC 

AAGCATCrCmTAACXTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAA 

(XGGGATAACXTATGCATCATACGGGTACTTCnXjCXIAAATGCXTC^ 

TCAGAGCTGCTATGGTAGAATACrCATACATATTCTTAGATGAATACCATnyrGC 

CACTCCTGAACAACnXXKTU^TTATCGGGAAGATaZACAGATTT^ 

AAGGGTTGTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGT 
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CAAAAGCACCCAATAGAGGAATTCATAGCCCCCGAGGTAATGAAACXjCXjAGG 

ATCTTGGTAGTCAGTTCC ITGATA TAGCAGGGTTAAAAATACCAGTGGATGAGA 

TGAAAGGCAATATGTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGTA 

GCAAAGAAGCrAAAAGCTAAGGGCrATAACTCrGGATACrATTACAGTGGAGA 

GGATCCAGCCAATCTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGC 

TACAAATGCTATTGAATCAGGAGTGACACTACCAGAnTGGACACGGTTATAGA 

CACXjGGGTKSAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCA 

TCGTAACAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCG 

TAGGGGCAGAGTAGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAA 

ACAGC7>u\CAGGGTCAAAGGACTACCACTATGACCTCTTGCAGGCACAAAGATA 

CGGGATTGAGGATGGAATCAACGTGACGAAATCCTTTAGGGAGATGAATTACG 

ATTGGAGCXTATACGAGGAGGACAGCCTACTAATAACCCAGCTGGAAATACTA 

AATAATCTACTCATCTCAGAAGACITGCCAGCCXjCTGTrAAGAACATAATGGCC 

AGGACTGATCACCCAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCA 

GGTCCCGGTCCTGTTCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACX} 

AAAATTACrCGTrTCrAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTATA 

TCTAOjCrACTGAAGATGAGGATCnXKjCAGTTGACCTCTrAGGGCTAGACT 

(XTGATCCrGGGAACCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGT 

GACCGGGTTGTCXnXXX3CTGAAAATG<XXTACTAGTGGCmAT^ 

GGGTTACXIAGGCTCrcnXIAAAGAGGCATGTCrcAATGATAACAOACATATAT^^ 

CATCGAGGACCAGAGACTAGAAGACACCACCCACCTCCAGTATGCACCCAACG 

CCATAAAAACCGATGGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGA 

CGTGGAAAAAATCATGGGAGCCATTTCAGATTATGCAGCTGGGGGACTGGAGT 

TTGTTAAAT(XCAAGCAGAAAAGATAAAAACAGCIXXnTKnTrAAAGAAAA<:^ 

CAGAAG<XGCAAAAGGGTATGTCCAAAAATTCATTGACnXL\TrAATroAAAATA 

AAGAAGAAATAATCAGATATGGTTTGTGGGGAACACACACAGCACTATACAAA 

AGCATAGCTGCAAGACrGGGGCATGAAACAGCGTrrGaL\CACrAGTGTTAAA 

GTGGCTAGCnTTGGAGQQGAATCAGTGTCAQACCACXiTCAAGCAGGCGGCA ^ 

GTTGATTrAGTGGTCTATTATGTGATGAATAAGCCrrCXTrCXX:^ ^ 

GAGACACAGCAAGAAGGGAGGCGATTCGTCGCAAGCXrraTTCATCrCCGCACT g 

GGCL<W>kCCTACACATACAAAA(nTGGAATTACCACAATCIXJI^ P 

ACCAGCCCTGGCrrACCrCCCCrATGCrACCAGCGCATTAAAAATGTTCACCCC Q 

AACGCGGCTGGAGAGCGTGGTGATACTGAGCACCACGATATATAAAACATACC S 

TCTCTATAAGGAAGGGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGC 

AGCCATGGAAATCCTGTCACAAAACCCAGTATCGGTAGGTATATCTGTGATGTT 

GGGGGTAGQGGCAATCGCrGCGCACAACGCTATTGAGTCCAGTGAACAGAAA 

AGGACCCTACTTATGAAGGTGTTTGTAAAGAACITCTTGGATCAGGCIXX: 

GATGAGCKX3TAAAAGAAAA(XCAGAAAAAATTATAATGGCCTrATTTC 

GTCCAGACAATTGGTAACCCCXnXjAGACTAATATACXIACCrGTATGGQGT^ 

TACAAAGGTTGGGAGGCCAAGGAACTATCTGAGAGGACAGCAQGCAGAAACT 

TATTCACATTGATAATGTTrGAAGCCTTCGAGnATTAGGGATTC 

GGAAAATAAGGAACCTGTCCGGAAATTACATmXXJATTroATATAa^^ 

ACAAGCAAATCAACAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGC 

ACC(nTrAGTrGTGACTGGACCCCrAGTGACGAGAGGATCAGATrGCCAACAG 

aq\actatttgagggtagaaaccaggtgcccatgtggctatgagat gaaag ct 

ttcaaaaatgtaggtggcaaacitaccaaagtggaggagagcgggccntcct 

atgtagaaacagacctggtaggggacx:agtcaactacagagtcaccaagtatt 

acgatgacaacctcagagagataaaaccagtagcaaagttggaaggacaggta 

gagcactactacaaaggggtcacagcaaaaatrgactacagtaaaggaaaaat 

gctctrggaiactgacaagtgggaggtggaacatggtqtcataaccaggttag 

ctaagagatatactggggt(xk3gtrcaatggt<k:atactraggtgacgagccc 

aatcaccxtgcictagrggagagggactgrgcaactataacxlaaaaaq^^ 

ACAGTTTCTAAAAATGAAGAAQGGCnXjTGCXnTCAOCTATGACCTGArc 
CAATCnX3ACOiGGCKL\TCGAACrAGTACACAGGAACAATCITG^ 
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AAATACCCACCXjCTACGGTCACCACATGGCTAGCTTACACCTrCG^^ 

ACGTAGGGACTATAAAACCAGTACTAGGAGAGAGAGTAATCCCCGACCCTGTA 

GTTGATATCAATTTACAACCAGAGGTGCAAGTGGACACGTCAGAGGTTGGGAT 

CACAATAATTGGAAGGGAAACCCTGATGACAACGGGAGTGACACCTGTCTTGG 

AAAAAGTAGAGCCTGACGCCAGCGACAACCAAAACTCGGTGAAGATCGGGTTG 

GATGAGGGTAATTACCCAGGGCCTGGAATACAGACACATACACTAACAGAAGA 

AATACACAACAGGGATGCGAGGCCCTTCATCATGATCCTGGGCTCAAGGAATT 

CCATATCAAATAGGGCAAAGACrGCTAGAAATATAAATCTGTACACAGGAAA^^ 

ACCCCAGGGAAATACGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGC 

CTGAGGGATGTCGACCCTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACTTT 

TTTAGATAGGGAGGCCCTGGAGGCTCTAAGTCTCGGGCAACCTAAACCGAAGC 

AGGTTACCAAGGAAGCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAG 

ATCCCTAACTGGTrrGCATCAGATGACCCAGTATrrCTGGAAGTGGCCI^ 

AATGATAAGTACTACTTAGTAGGAGATGTTGGAGAGGTAAAAGATCAAGCTAA 

AGCACnTGGGGCCACGGATCAGACAAGAATTATAAAGGAGGTAGGCrCAAGG 

ACGTATGCCATGAAGCTATCTAGCrGGrrCCTCAAGGCATCAAAC^^ 

ACnTTAACTCCACTGTTTGAGGAArrCrITGCrA(^^ 

AGCAATAAGGGGCACATGGCATCAGCITACCAATTGGCACAGGGTAACI^^ 

GCCCCTCQGTTGCQGQQTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAG 

ATACACXXlATATGAAGCITACCraAAGTTGAAAGATTTCATAGi^^ 

AAGAAACCTAGGGTTAAGGATACACn'AATAAGAGAGCACAACAAATGGATACT 

TAAAAAAATAAGGTTTCAAGGAAACCTCAACACCAAGAA^ 

GAAACTATCrGAACAGTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAAC 

CACCAGATTGGTACTATAATGTCAAGTGCA GGCAT AAGGCTGGAGAAAIT^ 

AATAGTGAGGGCCCAAACCGACACCAAAACCTTTCATGAGGCAATAAGAG^^ 

AGAT AGAC AAGAGTGAAAACCGGCAAAATCCAGAATTGCACAACAAAm 

GAGATTTTCCACACGATAGCCCAACCCACCCTGAAACACACCTACGGTGAGGT ^ 

GACGTGGGAGCAACTTGAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCrTC 3 

CTGGAGAAGAAGAACATCGGAGAAGTATKXjATTCAGAAAAGCACCT^^ ^ 

ACAATrGGTC\GGGATCTGAAGGCCXKK3AGAAAGATAAAATATTATGAAA ^ 

CAATACCAAAAAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGA g 

CCTGGTGGTTGAGAAGAGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAA S 

ggctagccatcactaaggtcatgtataactgggtgaaacagcagcccgttgtg a 

attccaggatatgaaggaaagacccccitgttcaacatcm to 

aaggaatgggactcgttcaatgagccagtggccgtaagttttgacaccaaagc 

ctgggacactcaagtgactagtaaggatctgcaacntattggagaaatcca 

aatattactataagaaggagtggcacaagttcattgacaccatcaccgaccaca 

tgacagaagtaccagttataacagcagatggtgaagtatatataagaaatggg 

cagagaqggagcggccagccagacacaagtgctggcaacagcatgttaaatg 

TCCTX}ACAATGATGTAa}CXnTCI^ 

TTCAACAGGGTGGCAAGGATCCACGTCTGTGGGGATGATGQOT 

TOAAAAAGGGTTAGGGCTGAAATTTGCTAACAAAC^ 

AAGCAGGCAAACCTCAGAAGATAACGGAAGGGGAAAAGATGAAAGTK^ 

AGATTTGAGGATATAGAGTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCC 

GACAACACCAGTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAA 

GATGGCAACAAGATrcGArrCAAGTGGAGAGAGGGGTACCACAGCATATG;^ 

AAGCGGTAGC(m'CAGTTTCTTGCTGATGTATrCCT 

GGATTTGCCTGTTGGTCXTITCGCAACAGCCAGAGACAGAC(XATCAAA^ 

CCACTTATTATTACAAAGGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTC 

GGAATCTAAGTGAACTGAAGAGAACAGGCITrGAGAAATrGGCAAATCT/^ 

CTAAGCCTGTCCACGTTGGGGATCTGGACTAAGCACACAAGCAA 

TCAGGACTGTGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTA^ 

ACAGGCTGATATCrAGCAAAACTGGCCACTTATACATACCTCATA^ 

CATrACAAGGAAAGCATTATGAGCAACnX}CAGCrAAGAACAGAGACAA^ 
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GTCATGGGGGTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTXSCr 

GCTGAG/W^GGTTGAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAg 

acaaaatgtatatattgtaaataaattaatccatgtacAATTCCGCCCCTCTCCCn^ 

TTACTGGCCGAAGCCGCT TGGA ATAAGGCCGGTGTGCGTrrGTCTATATGTTA 

TTTCCACCATATTGCCGTCriTTGGCAjVrGTGAGGC^ 

TCnTCTTGACGAGCATTCCTAGGGGTCTTTCCCCTCrCGCCAA^ 

GTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGCITCTTGAAGACAAA 

CAACGTCTGTAGCQACCCTrTGCAGGCAGCGGAACCCCCCACCTGGCGACAGQ 

TGCCICTGCGGCX:AAAAG<XACXyiX3TATAAGATACACC^^ 

ACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCT 

CCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCC^ 

ATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTC^^ 

GTTAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCC^^ 

AACACGATGATAAGCITGCCACAAQ:atgaccgagtacaagcccacgg^gcgcctcgccacccgcgac^ 

cg^cccccgggccgtacgcaccctcg(x:gccgcgttcgccgactaccccgcx:acgcgc^ 

gagcgggtcaccgagctgcaagaactcttcctcacgcgcgtcgggctcgacatcggcaaggtgtgggtcgcggacg^ 

gcggtggcggtctggaccacgccggagagcgtcgaagcgggggcggtgttcgccgagatcggcccgcgcatgg 

cggttcccggctggccgcgcagcaacagatggaaggcctcctggcgccgcaccggcxcaaggagccc 

cgtcggcgtctcgcccgaccaccagggcaagggtctgggcagcgccgtcgtgctccccggagtggaggcgga:^ 

gggtgcccgccttcctggagacctccgcgccccgcaaixtccccttc^ 

gcccgaaggaccgcgcgacctggtgcatgacccgcaagcccggtgccTGAcgcxcgccccacgacccgcagcg 

aaaggagcgcacgaccccatgaaATGCATCGATCGTACGAATTAACGCCGACAGGCTGATAT 

CCAGCAAAACTGGCCACITATAQ^TACCTGATAAAGGCmACAT^^ 

AGCATTATGAGCAACraCAGCTAAGAACAGAGACAAACCCGGTCATGGGC^^ 

GGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCnXKTGCTGAG^ 

GAAAATTCTGCTCATGACGGCCGTCXXK:GTCAGCAGCrGAgacaaa^ 

tcaagattatctacctcaagataacactacatttaatgcacacagcactttagctg 
gaagacctctaacagccccc 
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Gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcstaggtacatggcacg^ 

gatgggggcga ca ctccaccatgaatcactexactgtgaggaactac^tcttcacgcagaM 

tgtcgtgcagcctocaggac(xcccctcccgggagagccaUgtggtctgcggaaa;ggtgagtacaccg^ 

cgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagt^^ 

aggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATGAGCACGAATC 

CTAAACCrCAAAGAAAAAaL\AACX3TAACA(XAACXX3TCGC^ 

AAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAG

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGACGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGTTGCGGG

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCCCCAC 

AGAOCXXXXKjCXjrAGGTCGCGCAATTTGGGTAAGGTCATCGATACCCrrAC^ 

GiXGCTTCGCCGACCTCATGGGGTACATArcGCrCGTCGGCXKrCC^^ 

GGCGCTGCCAGGGCCCTGGCGCATGGCGT CCGG GTTCTGGAAGACGGCGTGA 

ACTATGCAACAGGGAACCTTCCTGGTTGCTCTITCICTATCnT^^ 

GCTCTCTTGCCTGACCGTGCCCGCITCAGCCTACCAAGTGCGCAATTCCT^^ 

GCTTTACCATGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGGC

CGATGCCATCCTGCACACTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGTAACG

CCTCGAGGTGTTGGGTGGCGGTGACCCCCACGGTGGCCACCAGGGACGGCAA

ACTCCCCACAACGCAGCTTCGACGTCATATCGATCTGCTTGTCGGGAGCGCCA

CXXnCroCTCGGCCXnxn'ACGTGGGGGACCTCTGCGGGTCTGT^^ ^ 

GTCAACnjriTACCTKnCTCCCAGGCGCCACTGGA(X}AC^^ 3 

GTTCTATCTATCCOjGCCATATAACGGGTCATCGCATGGCATGG^ m 

TGAACTGGTCCCCTACGGCAGCXJTroOTGGTAGCrcAGCraCTCCXKSATCC^ U 

CAAGCCATCATGGACATGATCXjCTGGTGCTCACTGGGGAGTCCTGGCGGGCAT « 

AGCGTATTTCTCCATGGTGGGGAACTGGGOSAAGGTCCTGGTAGTGCTGCTGC S 

TATTTGCCGGCGTCGACGCGGAAACXrCACGTCACCGGGGGAAGTGCCGGCCG 5 

CACCACGGCTGGGCTTGTTGGTCrCXnTACACCAGGCGCCAAGCAGAACATCC ^ 

AACTXjATCAACACCAACGGCAGTKKKIACATCAATAGCAa^ 

AATGAAAGCCTrAACACCGGCrGGrTAGCAGGGCTCTrCrATCAGCACAA ATrC 

AACTCrrCAGGCrGTCCTGAGAGGTTGGCCAGCraCCGACGCCITACCGATT^ 

GaX:AGGGCIXKXK3TCCTATCAGTTATGCCAACGGAAGCGGCCTCGACGAAC 

GCrCCTACTGCTGGCACTACCCKX^AAGACCTTGTGGCAT^^ 

AGCGTGTGTGGCCCGGTATATTGCITCACTC(XAGCCCCXJnKJTGGTGC^ 

GACXXjACAGGTCGGGCGCGCCTACCTACAGCTGGGGTGCAAATGATACGGAT 

CTCTTCGTCCITAACAACA(XAGGCCA(XGCTGGGCAATTGGTTCGGTTGTArc 

TGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCCCCTTGTGTCAT

CGGAGGGGTGGGCAACAACACCTTGCTCTGCCCCACTCATTGTTTCCGCAAGC 

ATCCGGAAGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATTACACCCAGG

TGCATGGTCGACTACCCGTATAGGCTTIXjGCACTATCCTTGTACCATCAA^ 

ACCATATTCAAAGTCAGGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAG 

CGGCCTGCAACTGGACGCXXKSGCGAACGCTGTGATCTGGAAGACAGGGACAG 

GTCCGAGCTCAGCC(L^TIXX^XX^t}TCCACCACACAGTGGCA 

GTTCrmiACGACCCrGCXAGCCTroTCCACCGGCCT^ 

ACATTGTOGACGTGCAGTACTTGTACXKKKjTAGGGTCAAGCATCOCXjTCCT^ 

GCCATTAAGTOGGAGTACGTCGTTCTCXnXjTrCCrcCTGCITGCAGACGCG^ 

GTCTGCrCCTGCTTGTGGATGATGTTACTCATATCCCAAGCGGAGGCGGCnTG 

GAGAAOH'CGTAATACTCAATGCAGCATCCCrGGCXrGGGACGCACGGTCrrGT 

GT(XTrcCTCGTGTTCnTCTGCTTIXKX!TGGTATCTC 
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CGGAGCGGTCTACGCCITCTACGGGATGTGGCCrCTCCTCCTGCTCCTGCT 

CGTTGCCTCAGCGGGCATACGCACTGGACACGGAGGTGGCCGCGTCGTGTGG 

CGGCGTTGTTCnTGTCGGGTTAATGGCGCTGACTCTGTCGCCATATTACAAGCG 

CTACATCAGCTGGTGCATGTGGTGGCITCAGTATTrrCTGACCAGAGTAGAAGC 

GCAACTGCACGTGTGGGTTCCCCCCCTCAACGTCCGGGGGGGGCGCGATGCC 

GTCATCTTACTCATGTGTGTTGT ACACC CGACTCTGGTATTTGACATCACCAAAC 

TACTCCrGGCCATCTTCGGACCCCTTTGGATTCITCAAGCCAGT^ 

CCCCTACTTCGTGCGCGTTCAAGGCCTTCTCCGGATCTGCGCGCTAGCGCGGA 

AGATAGCCGGAGGTCATTACGTGCAAATGGCCATCATCAAGTTAGGGGCGCTT 

ACTGGCACCTATGTGTATAACCATCTCACCCCTCTTCGAGACTGGGCGCACAAC 

GGCCTGCGAGATCTGGCCGTGGCTGTGGAACCAGTCGTCTTCTCCCGAATGGA 

GACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGTGACATCATC 

AACGGCTTGCCCGTCTCTGCCCGTAGGGGCCAGGAGATACTGCTTGGGCCAGC 

CGACGGAATGGTCTCCAAGGGGTGGAGGTTGCTGGCGCCCATCACGGCGTAC 

GCCCAGCAGACGAGAGGCCTCCTAGGGTGTATAATCACCAGCCTGACTGGCCG 

GGACAAAAACCAAGTGGAGGGTGAGGTCCAGATCGTGTCAACTGCTA(XC/^ 

CCirCCTGGCAACGTGCATCAATGGGGTATGCrGGACTGTCTACC^ 

GGAACGAGGACCATCGCATCACCCAAGGGTCCTOTCATCrAGATGT^^ 

TGTGGACXAAGACCTTGTQGGCroGCCCGCTCCTCAAGGTTC 

CACCCTGCACCnXXKKnx:CTCGGACCTrrACCTC^ 

GTCATTCCCGTGCGCCGGCGAGGTGATAGCAGGGGTAGCCTGCTTTCGCCCCG 

GCCCATrrCCTACTTGAAAGGCTCCrCGGGGGGTCCGCTG^ 

GACACGCCG TGGG CCTATTCAGGGCCGCGGTGTGCACCCGTGGAGTGGCTAA 

GGCGGTGGACTTTATCCCTGTGGAGAACCTAGAGACAACCATGAGATCCCC^ 

TGTTCACGGACAACTCCTCTCCACCAGCAGTGCCCCAGAGCTTCCAGGTGGCC 

CACCTGCATGCTCCCACCXSGCAGCGGTAAGAGCACCAAGGTCCCXjGCrGCQTA 

CGCA GCCC AGGGCTACAAGGTGTTGGTGCTCAACCXCTCTGTTGCT^ 

TOQGCTTIX3GTGCTTACATGTCCAAGGCCCATGGGGT^ 

CCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACCTAC^^ 

AAGTTCXTTGCCGACGGCGGGTGCrCAGGAGGTGCTTATGACATAATAATT^ 

GACGAGTGCCACTCCACGGATGCCACATCCATCrrGGGCATCGGCACTGTCCT 

TGACCAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACTGCTACC 

CCTCCGGGCTCCGTCAC TGTGTC CCATCCTAACATCGAGGAGGTTGCTCTGTCC 

ACCACCGGAGAGATCCCCTTTTACGGCAAGGCTATCCCCCrCGAGGTGATCAA 

GGGGGGAAGACATCrCATCTTCTGCCACTCAAAGAAGAAGTGCGACGA 

CCGCGAAGCTGGTCGCArrGGGCATCAATGCCGTGGCCTACTACCGCGGTCTT 

GACGTGTCTGTC ATCC CGACCAGCGGCGATGTTGTCGTCGTGTCGACCGATGC 

TCnx:ATGACrGGCmACCGGCGACrKXjACTCT 

TGTCACTCAGACAGTCGATITCAGCCTTGACCCTACCTIT^ 

CACX3CTCCCX!CAGGATGCTGTCnx:CAGGACT^ 

AGGGGGAAGCCAGQCATCrACAGATTTGTGGCACCGGGGGAGCGCCCCTCCG 

GCATGTTCGACrCGTCCGTCCTCTGTGAGTGCTATGACGCGGGCrGTGC^ 

TATGAGCrCACGCCCGCCGAGACTACAGTTAGGCTACGAGCGTACATGAACAC 

(XCGGGGCTTCCCGTGTGCCAGGACCATCITGAATT^ 

(XKlGCCrCACTCATATAGATGCCCACTTTCTATCCCAGACAA^ 

GAGAACmCCnTACCrGGTAGCGTACCAAGCCACCGTGTGCGCTAGGGCT 

AGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATCCGCCTrAAAC 

CCACCCTCCATGGGCCAACACCCCTGCTATACAGACrGGGCGCTGTTCAGAAT 

GAAGTCACCCTGACGCACCCAATCACCAAATACATCATGACATGCATGTCXX^ 

GACCTGGAGGTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTG 

CTCTGGCCGCGTATTGCCTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGATT 

GTTCGATGAGATGGAAGAGTGCTCTCAGCACTTACCGTACAT<XjA(^ 
TGATGCTax:TGAOCAGTTCAAGCAGAAGGCCCTCGGCCTCCTOCAGACC^ 
TCCCGCCAAGCAGAGGTTATCACCCCTGCTGTCCAGACCAACTGGCAGAAACT
CGAGGTCTTCTXjGGCGAAGCACATGTGGAATTTCATCAGTGGGATACAAT AOT 

ggcgggcctgtcaacgctgcctggtaaccccgccattgcttcattgatggc^ 

tacagctgccgtcaccagcccactaaccactggccaaaccctcctcttcaacat 

attgggggggtgggtggctgcccagctcgccgcccccggtgccgctaccgcc 

tttgtgggcgctggcttagctggcgccgccatcggcagcgttggactgggga 

aggtcctcgtggacattcttgcagggtatggcgcgggcgtggcgggagctct 

tgtagccrrcaagatcatgagcggtgaggtccccrccacggaggacctggtca 

atctgctgcx:cgccatcctctcgcctggagccci^ 

gcagcaatactgcgccggcacgttggcccgggcgagggggcagtgcaatgga 

TGAACCGGCTAATAGCCITCGCCTCCCGGGGGAACCATGTTTCCCCCACGCAC 

TACGTGCCGGAGAGCGATGCAGCCGCCCGCGTCACTGCCATACTCAGCAGCCT 

CACTGTAACCCAGCTCCroATcgCrAGaccatggggtaccgagC GTTA CTC^ 

GCITGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTT^ 

GTCITITGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCm^ 

TTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTG^^ 

TGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTC^^ 

ACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCA 

AAAGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACG 

TTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCIXnX^ 

ACAAGGGGCTGAAGGA TGCCC AGAAGGTACrCCATTGTATGGGATCTGATCTG 

GGGCCTCGGTGCACATGCTTTACA TGTGTTTAGT CXjAGGTTAAA^ 

gccccxx:gaaccacgggg acgtgg ttitcctitgaaaaacacgatgataa 

GGAGTTGATCACAAATGAAariTrATACAAAACATAC^^ 

GGTGGAGGAACCTGTTTATGATCAGGCAGGTGATCCCTTATTT^ 

GAGCAGTCCACCXTCAATCGACGCTAAAGCTCCCACACAAGAGAGGGGAACXjC ^ 

GATGTrCCAACCAACTTGGCAT(XTrACCAAAAAGAGGTGACT^^

TAATAGCAGAGGACCTXjTGAGCGGGATCTACCTGAAGCCAGGGCCAC^^

ACCAGGACTATAAAGGTCCCGTCTATCACAGGGCOCXXjCrGGAGCTCm

GAGGGATCCATGTGTGAAACGACTAAACGGATAGGGAGAGTAACTGGAAGTG

ACGGAAAGCTGTACCACATITATGTGTGTATAGATGGATGTATAATAATAAAAA

GTGCCACGAGAAGTTACCAAAGGGTGTTCAGGTGGGTCCATAATAGGCTTGAC

TGCCCTCTATGGGTCACAAGTTGCTCAGACAC GAAAG AAGAGGGAGCAACAaag

cttGCATTGTTGGCCTGGGCAATAATAGCTATAGTTITGTTrCAAGTrA 

AGAAAACATAACACAGTGGAACctgcagfTGGTTTGACCTGGAGGTGACTGAC^ 

CACCXKJGATTACTTCGCTGAGTCCATATTAGTGGTGGTAGTAGCCCTCT^^ 

GGCAGATATGTACTITGGTTACnX3GTTACATACATGGT(^ 

GCCTTAGGGATTCAGTATGGATCAGGGGAAGTGGTGATGATGGGCAAOT 

AACCXV^TAACAATATrGAAGTGGTGACATACrrCTrGCrGCTC 

GAGGGAGGAGAGCGTAAAGAAGTGGGTCITACTCTrATACCACATC^ 

TACACXXL\ATCAAATCK}TAATTGTGATCCTACnXJATC 

AGGCCGATTCAGGGGGCCAAGAGTACITGGGGAAAATAGACCTCTGTm 

ACAGTAGTACTAATOjTCATAGGTITAATCATAGCrAGGCGTGACCCAACTAT^ 

GTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACCCACCA 

GCCTGGAGTTGACATCGCTGTGGCGGTCATGACTATAACCCTACTGATGGTTA 

GCTATGTGACAGATTATTTTAGATATAAAAAATGGTTACAGTGCATT 

GGTATCTGGGGTGTTCrrGATAAGAAGCCTAATATACCTAGGTAGAATCGAGAT 

GCCAGAGGTAACTATCCCAAACTGGAGACCACTAACnTTAA^^ 

ATXnCAACAACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCT 

GCAATGTGTGCCTATCTTATTGCTGGTCACAACCnTGTGQGC^ 

CCTAATACTGATCCTGCCTACCTATGAATTOGTr/^ 

GTTAGGACrGATATAGAAAGAAGTTGGCTAGGGGGGATAGA CTATACAA GAGT 

TGACTCCATCTACGACGTTGATGAGAGTGGAGAGGGCGTATATCTTTI^ 

AAGGCAGAAAQCACAGGGGAATTITICTATA 
ACTOAT AAGT TOCCTCAGCAGTAAATGGCAGCTAATATACATGAGTTACTTAAC

TTTGGACTrTATGTACTACATGCACAGGAAAGTTATAGAAGAGATCTCAGGAGG 

TACCAACATAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGGTCCAT 

GGAAGAAGAGGAGAGCAAAGGCTrAAAGAAGTTTTATCTATTGTCTGGAAGGT 

TGAGAAACCTAATAATAAAACATAAGGTAAGGAATGAGACrGTGGCITCTrGGT 

ACGGGGAGGAGGAAGTCTACXjGTATGCCAAAGATCATGACTATAATCAAGGCC 

AGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGAGGGCCG 

AGAGTGGAAAGGTGGCACCTGC CCAA AATGTGGACGCCATGGGAAGCCX3ATA 

acx3t(mkjgatgtcgctagcagattltgaagaaaga^ 
ataagggaaggcaactttgagggtatgtgcagccgatgccagggaaagcata 
ggaggtttgaaatggaccgggaacctaagagtgccagatactgtgctgagtgt 
aataggcrgcatcxnxxrr gagg aaggtgacttttgggcagagtcgagcatgtt 

GGGCCTCAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGATATCAC 

AGAGTGGGCTGGATGCCAGCGTGTGGGA ATCTC CCCAGATACCCACAGAGTCC 

CTIXjTCACATCTCATTTGGTTCACGGATGCCTTTCAGGCAGGAATACAATGGCT 

TTOTACAATATACX^GCTAGGGGGCAACTATTTCTGAGAAACTraCCCGTACT^ 

CAACrAAAGrAAAAATGCnt:ATGGTAGGCAACCnTGGAGAAGAAATI^^ 

TGGAACATCITGGGTGGATCCTAAGGGGGarKK:CGTGTGTAAGAAG^ 

GAGCACX3AAAAATGCCACATTAATATACrGGATAAACTAACCGCATTr^^ 

ATCATGCXIAAGGGGGACTACACCCAGAGCCXXXKjTGAGGTrCC^ 

ACTAAAAGTGAGGAGGGGTCnX3GAGACrGGCTGGGCTrACACACACCAAGGC 

GGGATAAGTTCAGTa}Aa:ATGTAACCGCCGGAAAAGATCTACTGQTCT 

CAGCATGGGACGAACrAGAGTGGTTTGCCAAAGCAACAACAGQTTGACCXjATG 

AGACAGAGTATGGCGTCAAGACTGACrCAGGGTGCCCAGACXiGTGCCAGATG 

TTATGTGTTAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGGCAGTCGT 

TCACCTCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCAGGCACAC 

(XGClllU14'CGACCTAAAAAACTIX5AAAGGATGGTCAGGCrrGCCTATATlTC T 

AAGCCTCCAGCXKX}AGGGTGGTTGGCAGAGTCAAAGTAGGGAAGAATGAAGA 

GTCTAAACCTACAAAAATAATGAGTGGAAT(XAGACCGTCTCAAAAAACAC^

AGA(XTGACCGAGATGGTCAAGAAGATAACCAGCATGAACAGGGGAGACrTCA

AGCAGATTACTTOKjCAACAGGGGCAGGCAAAACCACAGAACTCCCAAAAGCA

GTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTrCTTATACCATTAAGG

GCAGCX3GCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCCAAGCATCTC

TnTAACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAACCGGGATA 

ACCTATGCATCATACGGGTACnrCTGCCAAATGCCrcAACCAAAGCnx:^ 

GCTATGGTAGAATACTCATACATATTCTTAGATGAATACCATTGTGCCACTCCTG 

AACAACTGGCAATrATCXXX}AAGATCCACAGATrTTCAGAGAGTATAAGGGTr 

GTCXX:CATGACTGCCACGCCAGCAGGGTCX3GTGACG^CAACAGGTCAAAAGC 

ACOCAATAGAGGAAlTCATAGCCmXSAGGTAATGAAAGGGOAGGATCTTGGT 

AGTCAGTTCCrrOATATAGCAGGGTTAAAAATACCAQTGOATQAOATGAAAGO 

CAATATGTTGGTITITGTAa::AACGAGAAACATGGCAGT 

AGCTAAAAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGAGGATOCA 

G<X:AATCroAGAGTrGTGACATCACAATCCXXXn"ATGTAATCGTGC^ 

GCrATTGAATCAGGAGTGACACTACX:AGATTTGGACACGGTTATAGACACGGG 

GTraAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCrrCATCGTAA 

CAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCGTAGGGG 

CAGAGTAGGTAGAGTGAAACXXXXX3AGGTATrATAGGAGCCAGGAAACAGCA 

AQ\GGGTCAAAGGACrACX:ACrATGACCTCITGCAGGCACAAAGATACGGGAT 

TGAGGATGGAATCAACXnUACGAAATCCmAGGGAGATGAATTACGATrGGA 

GCCTATACGAGGAGGACAGCCTACTAATAACCCAGCrGGAAATACrAAATAATC 

TACrcATCrCAGAAGACTTGCCAGCCXXnX3TTAAGAAC^^ 

ATCACCCAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCAGGTCCCG 

GTCCK3TrCCCAAAAATAAGGAATGGAGAAGTCACAGAQ\CCTACGAAAATTAC 

TCXnTICTAAATGCCAGAAAGTTAGGGGAGGATGTGCCCX3TGTATATCTACX3Cr 
ACTGAAGATGAGGATCTGGCAGTTGACCTCITAGGGCTAGACTGGCCTGATCX:

TGGGAACCAGCAGGTAGTGGAGACTGGTAA AGCA CTGAAGCAAGTGACCGGG 

TTGTCCTCXjGCTGAAAATGCCCTACTAGTGGCTITATTTGGGTATGTGGGTTAC 

CAGGCTCTCrCAAAGAGGCATGTCCCAATGATAACAGACATATATACXIATCGAG 

GACCAGAGACTAGAAGACACCACCCACCTCCAGTATGCACCCAACGCCATAAA 

AACCGATGGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGACGTGGAA 

AAAATCATGGGAGCCATTTCAGATTATGC AGCT GGGGGACTGGAGTTTX^ 

ATCXXAAGCAGAAAAGATAAAAACAGCr(X'l'llUlTrAAAGAAAACXXL\GAAGC 

cgcaaaagggtatctccaaaaattcattgactcattaatt^ 

aataatcagatatggntgtggggaacacacacagcactatacaaaagcatagc 

tgcaagactggggcatgaaacagcgtttgccacactagtgttaaagtggctag 

Cl ' 1 ' ll GGAGGGGAATCAGTGTCAGACCACGTCAAGCAGGCGGCAGTroATTTA 

GTGGTCTATTATGTGATGAATAAGCCrrCCrrCCCAGGTGACrCCGAGACACAG 

CAAGAAGGGAGGCGATTCGTCGCAAGCCTGTTCATCTCCGCACTGGCAACCTA 

CACATACAAAACTTGGAATTACrACAATCTCrCTAAAGTGGTGGAACCAGCr 

GGCTTACXrrC(XXn'ATGCTAa:AGCGCATTAAAAATGTTCACX:CCAAC^ 

GGAGAGCGTGGTGATACrGAGCACX:ACGATATATAAAACATACXncrCTATAAG 

GAAGGGGAAOAOTGATGOATKXrroGGTACGGGGATAAGTGCAGCrATGGAA 

ATCCTGTCACAAAACCCAGTATCGGTAGGTATATCTGTGATGTTGGGGGTAGG 

GGCAATCXXJI^GCACAACXSCTATTGAGTCCAGTGAACAGAAAAGGACCXTAC 

TTATGAAGGTGTTTOTAAAGAAClTCTTGGATCAGGCKKrAACAGATGAGCrGG 

TAAAAGAAAACm\GAAAAAATTATAATGGCCTrATTTCAAGCAGTCX:AGAC^ 

TTGGTAA(X(XXnX}AGACTAATATACCACCTGTATGGGGTrTACrACAAAGGTT 

GGGAGGCCAAGGAACrATUTOAGAGGACAGCAGGCAGAAACTTATTCACATTG 

ATAATCTTTGAAGCCTTCGA GTTAT TAGGGATGGACTCACAAGGGAAAATAAG

GAACCTGTCCGGAAATTACATITn3GATITGATATA(XK3CCTACACAAC^

CAACAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCX:CTGCA<XXnTrAGTr

GTGACTGGACCCCTAGTGACXiAGAGGATCAGATrGCCAAC AGAC AACTATrrG

AGGGTAGAAACCAGGTGCCCATGTQGCTATOAGATGAAAGCnrCAAAAATGT

AGGTGGCAAACrTACCAAAGTGGAGGAGAGCGGGCCTTTCCTATGTAGAAACA

GACCrGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATTACGATGACAAC

CrCAGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTAGAGCACrACTA 

CAAAGGGGTCACAGCAAAAATTOACrACAGTAAAGGAAAAATGCTCITGGCXilA 

CTGACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAGCTAAGAGATAT 

ACTGGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCCCAATCACCGTGC 

TCTAGTGGAGAGGGACKjTGCAACTATAACCAAAAACACL^ 

AATGAAGAAGGGGTGTGCGTTCACCTATGA(XTGACX:ATCrCCAAT^^ 

GGCrCATCXSAACTAGTACACAGGAACAATCTroAAGAGAAGGAAATACCCACX: 

GCTACGGTCACCACATGGCTAGCITACACCrTCGTGAATGAAGACGTAGGGAC 

TATAAAACCAGTACTAGGAGAGAGAGTAAT(XCCGA(XCrGTAGTTGATATCAA 

TITACAACTAGAGGTGCAAGTGGACACXjrCAGAGGTrGGGATCACAATAATrG 

GAAGGGAAACCCTGATGACAACGGGAGTGACACCTGTCITGGAAAAAGTAGA 

GCCTGACXjCCAGCGACAACCAAAACTCGGTGAAGATCXSGGTTGGATGAGGGT 

AATTACCCAGGGOrrGGAATACAGACACATACACTAACAGAAGAAATACACAA 

CAGGGATGCGAGGC(XTTCATCATGATCCrGGGCTCAAGGAATTCCATATCAA 

ATAGGGCAAAGACTGCTAGAAATATAAATCrGTACACAGGAAATGACCCCAGG 

GAAATACXjAGACTTGATGGCTGCAGGGCGCATGTTAGTAG TAGCACT GAGGGA 

TGTCGACCCnXJAGCTGTCnXiAAATGGTCGATrrcAAGGGGACnTm 

GGAGG<XXnX3GAGGCTCTAAGTCTCGGGCAACCTAAACCGAAGCAGGT^ 

AGGAAGCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAGATCCCrAAC 

TGGTTTGCATCAGATGACCCAGTATrrCnXjGAAGTGGCCTrAAAAAATGAT^ 

TACrACTTAGTAGGAGATGTTGGAGAGGTAAAAGATCAAGCTAAAGCACTTGG 

GGCCACGGATCAGACAAGAATTATAAAGGAGGTAGGCrCAAGGACGTATGCCA 

TGAAGCTATCTAGCTCGTTCCTCAAGGCATCAAACAAACAGATGAGT^ 
CACTGTTTGAGGAATrGTrGCTACXKnXBCCCACCTGCAACrAAGAGCAATAAG

GGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAGCCCCTCGG 

TTGCGGGGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGATACACCCAT 

ATGAAGCITACCTGAAGTrGAAAGATTTCATAGAAGAAGAAGAGAAGAAACCT 

AGGGTTAAGGATACAGTAATAAGAGAGCACAACAAATGGATACrTAAAAAAAT 

AAGGTITCAAGGAAACXTCAACACCAAGAAAATGCrcAACan^ 

tgaacagttggacagggagggga3caagaggaacatctacaaa:acx:7^gatt 

ggtacrataatgtcaagtgcaggcataaggcrggagaaattgccaatagtgag 

ggcccaaaccgacaccaaaacctttcatgaggcaataagagataagatagaca 

agagtgaaaaccggcaaaatcxragaatrgcacaacaaattgtrcgagattttcc 

acacgatagcccaacccaccctgaaacacacctacggtgaggtgacgtgggag 

caacttgaggcggggataaatagaaagggggcagcaggcttcctggagaaga 

agaacatcggag/>u\gtattggattcagaaaagcaccrggtagaacaattggtc 

agggatctgaaggccgggagaaagataaaatattatgaaactgcaataccaaa 

aaatgagaagagagatgtcagtgatgactggcaggcaggggacctggtggtt 

qagaagaggccaagagltatccaatacccrgaagccaagacaaggcragccat 

cactaaggtcatgtataacraggtgaaacagcagccxgttgtgatt^^ 

atgaaggaaagaccoccrrgttcaacatctitgataaagtgagaaaggaa 

gactcgttcaatgagccagtggcastaagttttgacaccaaagcctgggacac 

tcaagtgactagtaaggatctgcaacrratkkjagaaatcxlagaaatatracta 

taagaaggagtggcacaagttcattoacatxatcaccgacxacatgacagaag 

taccagttataacagcagatggtgaagtatatataagaaatgggcagagaggg 

agcggccagccagacacaagtgcrggcaacagcatgttaaatgtcctgacaat 

gatgtacgccitckk:gaaagcacaggggtac(:x}tacaagagtttcaacaggg 

tggcaaggatccacgtctgtggggatgatggcircttaataactgaaaaaggg 

ttagggctgaaatitgctaacaaagggatqcagattctrcatgaag^ 

accrcagaaoataa(xk3aaggggaaaagatoaaaqtrgcctatagatnx3agg 

atatagagttctgttctcataccccagtcccnxjttaggtggtc^ 

gtagtcacatggccgggagagacarcgctgtgatactatcaaagatggcaaca 

agattggattcaagtggagagaggggtaccacagcatatgaaaaagcggtag 

CCTrCAGri'rCl'l'GCTGATGTATTCCTGGAACCCG(mXjTTAGGAGGATTTGCCT 

GTTGGTCCTITCGCAACAGCCAGAGACAGACXCATCAAAACATGCCACITATrA 

TTACAAAGGTGATCCAATA GGGGC CTATAAAGATGTAATAGGTCGGAATCTAA 

GTGAACraAAGAGAACAGGCmGAGAAATKjGCAAATCTAAACCrAAGCCTG 

TCCACGTrGGGGATCrGGACTAAGCAC«iCAAGCAAAAGAATAATrCAGGACTG 

TGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAA CGCC GACAGGCTGA 

TATCCAGCAAAACTGGCCACrrATACATACXTGATAAAGGCITTACATTA 

GAAAGCATTATGAGCAACTGCAGCrAAOAACAGAOACAAACCCXK3TCATGGGG 

GTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCrrocra^ 

GTTGAAAATTCTGCTCATGAOjGlXGTCXKICGTCAGCAGCroAg 

Mflflt^flgWflafr ratpnriitagtgtfltataaatatagrtgggMXgtMacc tPJng flflgflcgflcacgc^^ 

agtagtcaagattatctacctcaagataacactacafflaatgcacacagcactttagctgtatgaggatacgc^ 
tagggaagacctctaacagoccoc 